(54) Title: PROTEIN MODIFICATION AND MAINTENANCE MOLECULES

(57) Abstract: Various embodiments of the invention provide human protein modification and maintenance molecules (PMMM) 
and polynucleotides which identify and encode PMMM. Embodiments of the invention also provide expression vectors, host cells, 
antibodies, agonists, and antagonists. Other embodiments provide methods for diagnosing, treating, or preventing disorders
associated with aberrant expression of PMMM.
phosphorylates tyrosine residues, and the other class, protein serine/threonine kinases (STKs),
phosphorylates serine and threonine residues. Some PTKs and STKs possess structural
characteristics of both families and have dual specificity for both tyrosine and serine/threonine
residues. Almost all kinases contain a conserved 250-300 amino acid catalytic domain
containing specific residues and sequence motifs characteristic of the kinase family (reviewed
in Hardie, G. and S. Hanks (1995) The Protein Kinase Facts Book, Vol I, Academic Press, San
Diego, CA, pp. 17-20).
Phosphatases 

Phosphatases hydrolytically remove phosphate groups from proteins. Phosphatases are
essential in determining the extent of phosphorylation in the cell and, together with kinases,
regulate key cellular processes such as metabolic enzyme activity, proliferation, cell growth and
differentiation, cell adhesion, and cell cycle progression. Protein phosphatases are characterized
as either serine/threonine- or tyrosine-specific based on their preferred phospho-amino acid
substrate. Some phosphatases (DSPs, for dual specificity phosphatases) can act on
phosphorylated tyrosine, serine, or threonine residues. The protein serine/threonine phosphatases
(PSPs) are important regulators of many cAMP-mediated hormone responses in cells. Protein
tyrosine phosphatases (PTPs) play a significant role in cell cycle and cell signaling processes.
Proteases 

Proteases cleave proteins and peptides at the peptide bond that forms the backbone of the
protein or peptide chain. Proteolysis is one of the most important and frequent enzymatic
reactions, occurring both within and outside of cells. Proteolysis is responsible for the activation
and maturation of nascent polypeptides, the degradation of misfolded and damaged proteins, and
the controlled turnover of peptides within the cell. Proteases participate in digestion, endocrine
function, tissue remodeling during embryonic development, wound healing, and normal growth.
Proteases can play a role in regulatory processes by affecting the half-life of regulatory proteins.
Proteases are involved in the etiology or progression of disease states such as inflammation,
angiogenesis, tumor dispersion and metastasis, cardiovascular disease, neurological disease, and
bacterial, parasitic, and viral infections.

Proteases can be categorized on the basis of where they cleave their substrates.
Exopeptidases, which include aminopeptidases, dipeptidyl peptidases, tripeptidases,
carboxypeptidases, peptidyl-di-peptidases, dipeptidases, and omega peptidases, cleave residues at
the termini of their substrates. Endopeptidases, including serine proteases, cysteine proteases,
and metalloproteases, cleave at residues within the peptide. Four principal categories of
mammalian proteases have been identified based on active site structure, mechanism of action,
(PROSITE PDOC00376). Apple domains are involved in protein-protein interactions. S1 family
members include trypsin, chymotrypsin, coagulation factors IX-XII, complement factors B, C,
and D, granzymes, kallikrein, and tissue- and urokinase-plasminogen activators. The subtilisin
family has members found in the eubacteria, archaebacteria, eukaryotes, and viruses. Subtilisins
include the proprotein-processing endopeptidases kexin and furin and the pituitary prohormone
convertases PC1, PC2, PC3, PC6, and PACE4 (Rawlings and Barrett, supra).

SPs have functions in many normal processes and some have been implicated in the
etiology or treatment of disease. Enterokinase, the initiator of intestinal digestion, is found in the
intestinal brush border, where it cleaves the acidic propeptide from trypsinogen to yield active
trypsin (Kitamoto, Y. et al. (1994) Proc. Natl. Acad. Sci. USA 91:7588-7592).
Prolylcarboxypeptidase, a lysosomal serine peptidase that cleaves peptides such as angiotensin II
and III and [des-Arg9] bradykinin, shares sequence homology with members of both the serine
carboxypeptidase and prolylendopeptidase families (Tan, F. et al. (1993) J. Biol. Chem.
268:16631-16638). The protease neuropsin may influence synapse formation and neuronal
connectivity in the hippocampus in response to neural signaling (Chen, Z.-L. et al. (1995) J.
Neurosci. 15:5088-5097). Tissue plasminogen activator is useful for acute management of stroke
(Zivin, J.A. (1999) Neurology 53:14-19) and myocardial infarction (Ross, A.M. (1999) Clin.
Cardiol. 22:165-171). Some receptors (PAR, for proteinase-activated receptor), highly expressed
throughout the digestive tract, are activated by proteolytic cleavage of an extracellular domain.
The major agonists for PARs, thrombin, trypsin, and mast cell tryptase, are released in allergy
and inflammatory conditions. Control of PAR activation by proteases has been suggested as a
promising therapeutic target (Vergnolle, N. (2000) Aliment. Pharmacol. Ther. 14:257-266; Rice,
K.D. et al. (1998) Curr. Pharm. Des. 4:381-396). Prostate-specific antigen (PSA) is a
kallikrein-like serine protease synthesized and secreted exclusively by epithelial cells in the
prostate gland. Serum PSA is elevated in prostate cancer and is the most sensitive physiological
marker for monitoring cancer progression and response to therapy. PSA can also identify the
prostate as the origin of a metastatic tumor (Brawer, M.K. and P.H. Lange (1989) Urology
33:11-16).

The signal peptidase is a specialized class of SP found in all prokaryotic and eukaryotic
cell types that serves in the processing of signal peptides from certain proteins. Signal peptides
are amino-terminal domains of a protein which direct the protein from its ribosomal assembly site
to a particular cellular or extracellular location. Once the protein has been exported, removal of
the signal sequence by a signal peptidase and posttranslational processing, e.g., glycosylation or
phosphorylation, activate the protein. Signal peptidases exist as multi-subunit complexes in both
yeast and mammals. The canine signal peptidase complex is composed of five subunits, all
associated with the microsomal membrane and containing hydrophobic regions that span the
membrane one or more times (Shelness, G.S. and G. Blobel (1990) J. Biol. Chem. 265:9512-
9519). Some of these subunits serve to fix the complex in its proper position on the membrane
while others contain the actual catalytic activity.

[...] N-terminal signal peptide on the elongating protein. The signal peptide directs the protein
and attached ribosome to a receptor on the ER membrane. The polypeptide chain passes through a
pore in the ER membrane into the lumen while the N-terminal signal peptide remains attached at
the membrane surface. The process is completed when signal peptidase located inside the ER
cleaves the signal peptide from the protein and releases the protein into the lumen.

Thrombin is a serine protease with an essential role in the process of blood coagulation.
Prothrombin, synthesized in the liver, is converted to active thrombin by Factor Xa. Activated
thrombin then cleaves soluble fibrinogen to polymer-[...]
blood clots. In addition, thrombin activates Factor XIIIa, which plays a role in cross-linking
fibrin.

[...] residue [...] peptide from protease-activated receptor 1 (PAR-1), formerly known as
the thrombin receptor. The cleavage of the amino-terminal peptide exposes a new amino
terminus and may also be associated with PAR-1 internalization (Stubbs, M.T. and W. Bode
(1994) Curr. Opin. Struct. Biol. 4:823-832; and Ofosu, F.A. et al. (1998) Biochem. J. 336:283-
285). In addition to stimulating platelet activation through cleavage of the PAR-1 receptor,
thrombin also induces platelet aggregation following cleavage of glycoprotein V, also on the
surface of platelets. Glycoprotein V appears to be the major thrombin substrate on intact
platelets. Platelets deficient for glycoprotein V are hypersensitive to thrombin, which is still
required to cleave PAR-1. While platelet aggregation is required for normal hemostasis in
mammals, excessive platelet aggregation can result in arterial thrombosis, atherosclerotic arteries
[...] 96:13336-13341 and references within).
[...] hydrolysis of ATP for their activity. These proteases contain proteolytic core domains and
[...] binding motif (PROSITE PDOC00803). Members of this family include the eukaryotic
mitochondrial matrix proteases, Clp protease, and the proteasome. Clp protease was originally
found in plant chloroplasts but is believed to be widespread in both prokaryotic and eukaryotic
cells. The gene for early-onset torsion dystonia encodes a protein related to Clp protease
(Ozelius, L.J. et al. (1998) Adv. Neurol. 78:93-105).

The proteasome is an intracellular protease complex found in some bacteria and in all
eukaryotic cells, and plays an important role in cellular physiology. The proteasome is a large
(~2000 kDa) multisubunit complex composed of a central catalytic core containing a variety of
proteases arranged in four seven-membered rings with the active sites facing inwards into the
central cavity, and terminal ATPase subunits covering the outer port of the cavity and regulating
substrate entry (for review, see Schmidt, M. et al. (1999) Curr. Opin. Chem. Biol. 3:584-591).
Proteasomes are associated with the ubiquitin conjugation system (UCS), a major pathway for the
degradation of cellular proteins of all types, including proteins that function to activate or repress
cellular processes such as transcription and cell cycle progression (Ciechanover, A. (1994) Cell
79:13-21). In the UCS pathway, proteins targeted for degradation are conjugated to ubiquitin, a
small heat stable protein. The ubiquitinated protein is then recognized and degraded by the
proteasome. The resultant ubiquitin-peptide complex is hydrolyzed by a ubiquitin carboxyl
terminal hydrolase, and free ubiquitin is released for reutilization by the UCS. Ubiquitin-
proteasome systems are implicated in the degradation of mitotic cyclic kinases, oncoproteins,
tumor suppressor genes (p53), cell surface receptors associated with signal transduction,
transcriptional regulators, and mutated or damaged proteins (Ciechanover, supra). This pathway
has been implicated in a number of diseases, including cystic fibrosis, Angelman's syndrome, and
Liddle syndrome (reviewed in Schwartz, A.L. and A. Ciechanover (1999) Annu. Rev. Med.
50:57-74). A murine proto-oncogene, Unp, encodes a nuclear ubiquitin protease whose
overexpression leads to oncogenic transformation of NIH3T3 cells. The human homolog of this
gene is consistently elevated in small cell tumors and adenocarcinomas of the lung (Gray, D.A.
(1995) Oncogene 10:2179-2183). Ubiquitin carboxyl terminal hydrolase is involved in the
differentiation of a lymphoblastic leukemia cell line to a non-dividing mature state (Maki, A. et
al. (1996) Differentiation 60:59-66). In neurons, ubiquitin carboxyl terminal hydrolase (PGP 9.5)
expression is strong in the abnormal structures that occur in human neurodegenerative diseases
(Lowe, J. et al. (1990) J. Pathol. 161:153-160).
Cysteine Proteases 

Cysteine proteases (CPs) are involved in diverse cellular processes ranging from the 



WO 03/025131 



PCT/US02/29221 



[...] catalysis proceeds via a thioester intermediate and is facilitated by [...]
asparagine residues. [...]

[...] peptides as well as propeptides. Most members bear a conserved motif in the propeptide that
[...] significance ([...], K.M. et al. (1993) Proc. Natl. Acad. Sci. USA 90:3063-
been proposed to contribute to brain damage resulting from head injury (McCracken, E. et al.
(1999) J. Neurotrauma 16:749-761). Calpain-3 is predominantly expressed in skeletal muscle,
and is responsible for limb-girdle muscular dystrophy type 2A (Minami, N. et al. (1999) J.
Neurol. Sci. 171:31-37).

Another family of thiol proteases is the caspases, which are involved in the initiation and
execution phases of apoptosis. A pro-apoptotic signal can activate initiator caspases that trigger a
proteolytic caspase cascade, leading to the hydrolysis of target proteins and the classic apoptotic
death of the cell. Two active site residues, a cysteine and a histidine, have been implicated in the
catalytic mechanism. Caspases are among the most specific endopeptidases, cleaving after
aspartate residues. Caspases are synthesized as inactive zymogens consisting of one large (p20)
and one small (p10) subunit separated by a small spacer region, and a variable N-terminal
prodomain. This prodomain interacts with cofactors that can positively or negatively affect
apoptosis. An activating signal causes autoproteolytic cleavage of a specific aspartate residue
(D297 in the caspase-1 numbering convention) and removal of the spacer and prodomain, leaving
a p10/p20 heterodimer. Two of these heterodimers interact via their small subunits to form the
catalytically active tetramer. The long prodomains of some caspase family members have been
shown to promote dimerization and auto-processing of procaspases. Some caspases contain a
"death effector domain" in their prodomain by which they can be recruited into self-activating
complexes with other caspases and FADD protein associated death receptors or the TNF receptor
complex. In addition, two dimers from different caspase family members can associate, changing
the substrate specificity of the resultant tetramer. Endogenous caspase inhibitors (inhibitor of
apoptosis proteins, or IAPs) also exist. All these interactions have clear effects on the control of
apoptosis (reviewed in Chan and Mattson, supra; Salveson, G.S. and V.M. Dixit (1999) Proc.
Natl. Acad. Sci. USA 96:10964-10967).
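
The cleavage specificity described above (caspases cleave after aspartate residues) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the function name `cleave_after_aspartate` and the example peptide are hypothetical, and only the single residue preceding the cut is considered, whereas actual substrate recognition may depend on further sequence context that is not modeled here.

```python
def cleave_after_aspartate(sequence: str) -> list[str]:
    """Split a one-letter amino acid sequence after each aspartate (D).

    Hypothetical helper: only the 'cleave after aspartate' rule is modeled.
    """
    seq = sequence.upper()
    fragments = []
    start = 0
    for i, residue in enumerate(seq):
        # Cut after D, except when D is the final residue (nothing follows it).
        if residue == "D" and i < len(seq) - 1:
            fragments.append(seq[start:i + 1])
            start = i + 1
    # Keep whatever remains after the last cut.
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    # Illustrative peptide only; not taken from the source text.
    print(cleave_after_aspartate("AADEVDGK"))
```

Each returned fragment other than the last ends in D, which mirrors the stated rule that the scissile bond follows an aspartate.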

Caspases have been implicated in a number of diseases. Mice lacking some caspases
have severe nervous system defects due to failed apoptosis in the neuroepithelium and suffer
early lethality. Others show severe defects in the inflammatory response, as caspases are
responsible for processing IL-1β and possibly other inflammatory cytokines (Chan and Mattson,
supra). Cowpox virus and baculoviruses target caspases to avoid the death of their host cell and
promote successful infection. In addition, increases in inappropriate apoptosis have been
reported in AIDS, neurodegenerative diseases and ischemic injury, while a decrease in cell death
is associated with cancer (Salveson and Dixit, supra; Thompson, C.B. (1995) Science 267:1456-
1462).

Aspartyl proteases

Aspartyl proteases (APs) include the lysosomal proteases cathepsins D and E as well as
chymosin, renin, and the gastric pepsins. Most retroviruses encode an AP, usually as part of the
pol polyprotein. APs, also called acid proteases, are monomeric enzymes consisting of two
domains, each domain containing one half of the active site with its own catalytic aspartic acid
residue. APs are most active in the range of pH 2-3, at which one of the aspartate residues is
[...] are likely to be synthesized with signal peptides and propeptides. Most family members have
three disulfide [...]
loop preceding the second aspartate, and the third and largest loop occurring toward the C
terminus. Retropepsins, on the other hand, are analogous to a single domain of pepsin, and
become active as homodimers with each retropepsin monomer contributing one half of the active
site. Retropepsins are required for processing the viral polyproteins.

APs have roles in various tissues, and some have been associated with disease. Renin
mediates the first step in processing the hormone angiotensin, which is responsible for regulating
[...] Biol. 71:475-503). Abnormal regulation and expression of cathepsins are evident in various
inflammatory disease states. Expression of cathepsin D is elevated in synovial tissues from
patients with rheumatoid arthritis and osteoarthritis. The increased expression and differential
regulation of the cathepsins are linked to the metastatic potential of a variety of cancers
(Chambers, A.F. et al. (1993) Crit. Rev. Oncol. 4:95-114).
Metalloproteases

Metalloproteases require a metal ion for activity, usually manganese or zinc. Examples
of manganese metalloenzymes include aminopeptidase P and [...]
Aminopeptidase P can degr[...]
responses. Aminopeptidase P has been implicated in coronary [...]
Administration of aminopeptidase P inhibitors has been shown to have a cardioprotective effect
in rats (Ersahin, C. et al. (1999) J. Cardiovasc. Pharmacol. 34:604-611).

Most zinc-dependent metalloproteases share a common sequence in the zinc-binding
domain. The active site is made up of two histidines which act as zinc ligands and a catalytic
glutamic acid C-terminal to the first histidine. Proteins containing this signature sequence are
known as the metzincins and include aminopeptidase N, angiotensin-converting enzyme,
neurolysin, the matrix metalloproteases, and [...] An alternate sequence is
found in the zinc carboxypeptidases, in which all three [...]
glutamic acid - are involved in zinc binding.
A number of the neutral metalloendopeptidases, including angiotensin converting enzyme
and the aminopeptidases, are involved in the metabolism of peptide hormones. High
aminopeptidase B activity, for example, is found in the adrenal glands and neurohypophyses of
hypertensive rats (Prieto, I. et al. (1998) Horm. Metab. Res. 30:246-248). Oligopeptidase
M/neurolysin can hydrolyze bradykinin as well as neurotensin (Serizawa, A. et al. (1995) J. Biol.
Chem. 270:2092-2098). Neurotensin is a vasoactive peptide that can act as a neurotransmitter in
the brain, where it has been implicated in limiting food intake (Tritos, N.A. et al. (1999)
Neuropeptides 33:339-349).

The matrix metalloproteases (MMPs) are a family of at least 23 enzymes that can degrade
components of the extracellular matrix (ECM). They are Zn2+ endopeptidases with an N-terminal
catalytic domain. Nearly all members of the family have a hinge peptide and a C-terminal
domain which can bind to substrate molecules in the ECM or to inhibitors produced by the tissue
(TIMPs, for tissue inhibitor of metalloprotease; Campbell, I.L. and A. Pagenstecher (1999)
Trends Neurosci. 22:285-287). The presence of fibronectin-like repeats, transmembrane
domains, or C-terminal hemopexinase-like domains can be used to separate MMPs into
collagenase, gelatinase, stromelysin and membrane-type MMP subfamilies. In the inactive form,
the Zn2+ ion in the active site interacts with a cysteine in the pro-sequence. Activating factors
disrupt the Zn2+-cysteine interaction, or "cysteine switch," exposing the active site. This partially
activates the enzyme, which then cleaves off its propeptide and becomes fully active. MMPs are
often activated by the serine proteases plasmin and furin. MMPs are often regulated by
stoichiometric, noncovalent interactions with inhibitors; the balance of protease to inhibitor, then,
is very important in tissue homeostasis (reviewed in Yong, V.W. et al. (1998) Trends Neurosci.
21:75-80).

MMPs are implicated in a number of diseases including osteoarthritis (Mitchell, P. et al.
(1996) J. Clin. Invest. 97:761-768), atherosclerotic plaque rupture (Sukhova, G.K. et al. (1999)
Circulation 99:2503-2509), aortic aneurysm (Schneiderman, J. et al. (1998) Am. J. Path. 152:703-
710), non-healing wounds (Saarialho-Kere, U.K. et al. (1994) J. Clin. Invest. 94:79-88), bone
resorption (Blavier, L. and J.M. Delaisse (1995) J. Cell Sci. 108:3649-3659), age-related macular
degeneration (Steen, B. et al. (1998) Invest. Ophthalmol. Vis. Sci. 39:2194-2200), emphysema
(Finlay, G.A. et al. (1997) Thorax 52:502-506), myocardial infarction (Rohde, L.E. et al. (1999)
Circulation 99:3063-3070) and dilated cardiomyopathy (Thomas, C.V. et al. (1998) Circulation
97:1708-1715). MMP inhibitors prevent metastasis of mammary carcinoma and experimental
tumors in rat, and Lewis lung carcinoma, hemangioma, and human ovarian carcinoma xenografts
in mice (Eccles, S.A. et al. (1996) Cancer Res. 56:2815-2822; Anderson et al. (1996) Cancer Res.
Sci. 112:3603-3617). The Kuzbanian protein cleaves a substrate in the NOTCH pathway
(possibly NOTCH itself), activating the program for lateral inhibition in Drosophila neural
development. Two ADAMs, TACE (ADAM 17) and ADAM 10, are proposed to have analogous
roles in the processing of amyloid precursor protein in the brain (Schlondorff and Blobel, supra).
TACE has also been identified as the TNF activating enzyme (Black, R.A. et al. (1997) Nature
385:729-733). TNF is a pleiotropic cytokine that is important in mobilizing host defenses in
response to infection or trauma, but can cause severe damage in excess and is often overproduced
in autoimmune disease. TACE cleaves membrane-bound pro-TNF to release a soluble form.
Other ADAMs may be involved in a similar type of processing of other membrane-bound
molecules.

Proteins of the ADAMTS sub-family have all of the features of ADAM family
metalloproteases and contain an additional thrombospondin domain (TS). The prototypic
ADAMTS was identified in mouse, and found to be expressed in heart and kidney and
upregulated by proinflammatory stimuli (Kuno, K. et al. (1997) J. Biol. Chem. 272:556-562). To
date eleven members are recognized by the Human Genome Organization (HUGO;
http://www.gene.ucl.ac.uk/users[...]). Members of this family have
the ability to degrade aggrecan, a high molecular weight proteoglycan which provides cartilage
with important mechanical properties including compressibility, and which is lost during the
development of arthritis. Enzymes which degrade aggrecan are thus considered attractive targets
to prevent and slow the degradation of articular cartilage (see, e.g., Tortorella, M.D. (1999)
Science 284:1664-1666; Abbaszade, I. (1999) J. Biol. Chem. 274:23443-23450). Other members
are reported to have antiangiogenic potential (Kuno et al., supra) and/or procollagen processing
activity (Colige, A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2374-2379).
Protease inhibitors 

Protease inhibitors and other regulators of protease activity control the activity and effects
of proteases. Protease inhibitors have been shown to control pathogenesis in animal models of
proteolytic disorders (Murphy, G. (1991) Agents Actions Suppl. 35:69-76). In patients with HIV
disease, protease inhibitors have been shown to be effective in preventing disease progression and
reducing mortality (Barry, M. et al. (1997) Clin. Pharmacokinet. 32:194-209). Low levels of the
cystatins, low molecular weight inhibitors of the cysteine proteases, correlate with malignant
progression of tumors (Calkins, C. et al. (1995) Biol. Biochem. Hoppe Seyler 376:71-80). The
cystatin superfamily of protease inhibitors is characterized by a particular pattern of linearly
arranged and tandemly repeated disulfide loops (Kellermann, J. et al. (1989) J. Biol. Chem.
264:14121-14128). An example of a representative of a structural prototype of a novel family
among the cystatin superfamily is human alpha 2-HS glycoprotein (AHSG), a plasma protein
synthesized in liver and selectively concentrated in bone [...]
tissues (Triffitt, J.T. (1976) Calcif. Tissue Res. 22:27-33), which is also classified as belonging to
the fetuin family. Fetuins are characterized by the presence of 2 N-terminally located cystatin-
like repeats and a unique C-terminal domain which is not present in other proteins of the cystatin
superfamily (PROSITE PDOC00966). AHSG has been reported to be involved in bone
formation and resorption as well as immune responses (Yang, F. et al. (1992) [...] 1130:149-156;
Lee, C.C. et al. (1987) PNAS USA 84:[...]
Biochem. 63:1383-1391). Additionally, AHSG has been implicated in infertility associated with
endometriosis ([...]
Autoimmunity 29:121-127) and inhibition of osteogenesis (Binkert, C. et al. (1999) J. Biol. Chem.
274:28514-28520). Decreased serum levels of AHSG have been detected in patients with acute
leukemias, chronic granulocyte and myelomonocyte leukemias, lymphomas, myelofibrosis,
multiple myeloma, metastatizing solid tumors, systemic lupus erythematosus, rheumatoid
arthritis, acute alcoholic hepatitis, fatty liver, chronic active hepatitis, liver cirrhosis, acute and
chronic pancreatitis, and Crohn's disease (Kalabay, L. et al. (1992) Orv. Hetil. 133:1553-1554,
1559-1560).

Serpins are inhibitors of mammalian plasma serine proteases. Many serpins serve to
regulate the blood clotting cascade and/or the complement cascade in mammals. Sp32 is a
positive regulator of the mammalian acrosomal protease, acrosin, that binds the proenzyme,
proacrosin, and thereby aids in packaging the enzyme into the acrosomal matrix (Baba, T. et al.
(1994) J. Biol. Chem. 269:10133-10140). The Kunitz family of serine protease inhibitors are
[...] regularly spaced over approximately 50 amino acid residues and form three intrachain disulfide
bonds. Members of this family include [...], tissue factor pathway inhibitor (TFPI-1 and
TFPI-2), inter-α-trypsin inhibitor, and bikunin (Marlor, C.W. et al. (1997) J. Biol. Chem.
272:12202-12208). Members of this family are potent inhibitors (in the nanomolar range) against
serine proteases such as kallikrein and plasmin. Aprotinin has clinical utility in reduction of
perioperative blood loss. ITI has been found to inactivate human trypsin, chymotrypsin,
neutrophil elastase and cathepsin G (Morii, M. et al. (1985) Biol. Chem. Hoppe Seyler 366:19-
21), and is suspected of playing a key role in the biology of the extracellular matrix and in the
pathophysiology of chronic bronchopulmonary diseases or lung cancer progression (Cuvelier, A.
et al. (2000) Rev. Mal. Respir. 17:437-446).



A major portion of all proteins synthesized in eukaryotic cells are synthesized on the
cytosolic surface of the endoplasmic reticulum (ER). Before these immature proteins are
distributed to other organelles in the cell or are secreted, they must be transported into the interior
lumen of the ER where post-translational modifications are performed. These modifications
include protein folding and the formation of disulfide bonds, and N-linked glycosylations.
Protein Isomerases 

Protein folding in the ER is aided by two principal types of protein isomerases, protein
disulfide isomerase (PDI), and peptidyl-prolyl isomerase (PPI). PDI catalyzes the oxidation of
free sulfhydryl groups in cysteine residues to form intramolecular disulfide bonds in proteins.
PPI, an enzyme that catalyzes the isomerization of certain proline imidic bonds in oligopeptides
and proteins, is considered to govern one of the rate limiting steps in the folding of many proteins
to their final functional conformation. The cyclophilins represent a major class of PPI that was
originally identified as the major receptor for the immunosuppressive drug cyclosporin A
(Handschumacher, R.E. et al. (1984) Science 226:544-547).

Protein Glycosylation

The glycosylation of most soluble secreted and membrane-bound proteins by
oligosaccharides linked to asparagine residues in proteins is also performed in the ER. This
reaction is catalyzed by a membrane-bound enzyme, oligosaccharyl transferase. Although the
exact purpose of this "N-linked" glycosylation is unknown, the presence of oligosaccharides
tends to make a glycoprotein resistant to protease digestion. In addition, oligosaccharides
attached to cell-surface proteins called selectins are known to function in cell-cell adhesion
processes (Alberts, B. et al. (1994) Molecular Biology of the Cell, Garland Publishing Co., New
York, NY, p. 608). "O-linked" glycosylation of proteins also occurs in the ER by the addition of
N-acetylgalactosamine to the hydroxyl group of a serine or threonine residue followed by the
sequential addition of other sugar residues to the first. This process is catalyzed by a series of
glycosyltransferases, each specific for a particular donor sugar nucleotide and acceptor molecule
(Lodish, H. et al. (1995) Molecular Cell Biology, W. H. Freeman and Co., New York, NY, pp.
700-708). For example, one of the glycosyltransferases in the dolichol pathway, dolichol
phosphate mannose synthase, is required in N-glycosylation, O-mannosylation, and
glycosylphosphatidylinositol membrane anchoring of protein (Tomita, S. et al. (1998) J. Biol.
Chem. 9249-9254). Thus, in many cases, both N- and O-linked oligosaccharides appear to be
required for the secretion of proteins or the movement of plasma membrane glycoproteins to the
cell surface.
An additional glycosylation mechanism operates in the ER specifically to target
lysosomal enzymes to lysosomes and prevent their secretion. Lysosomal enzymes in the ER
[...] phosphorylated on one or two mannose residues. The phosphorylation of mannose residues
[...] by N-acetylglucosamine phosphotransferase, and the second the removal of the N-
acetylglucosamine group by phosphodiesterase. The phosphorylated mannose residue then
targets the lysosomal enzyme to a mannose 6-phosphate receptor which transports it to a
lysosome vesicle (Lodish et al., supra, pp. 708-711).
skin, bone, tendon, cartilage, blood vessels and teeth. Members of the collagen family can be
distinguished from one another by the degree of cross-linking between collagen fibers and by the
number of carbohydrate units (e.g., galactose or glucosylgalactose) attached to the collagen
fibers. Hydroxylated lysine residues (hydroxylysine) are essential for stability of cross-linking
and as attachment points for carbohydrate units.

The enzyme lysyl hydroxylase catalyzes the hydroxylation of lysine residues to form
hydroxylysine. Lysyl hydroxylase targets the lysine residue of the sequence, X-lys-gly (lys =
lysine, gly = glycine, and X = any amino acid residue). Three isoforms of lysyl hydroxylase have
been characterized, termed LH1 (or PLOD; procollagen-lysine, 2-oxoglutarate 5-dioxygenase),
LH2 (or PLOD2), and LH3. The three enzymes share 60% sequence identity overall, with even
higher similarity in the C-terminal region. In addition, there are regions in the middle of the
molecule that have an identity of more than 80% (Valtavaara, M. et al. (1998) J. Biol. Chem.
273:12881-12886).
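
The X-lys-gly target sequence stated above lends itself to a simple sequence scan. The following minimal Python sketch is illustrative only: the function name `find_xkg_lysines` and the example peptide are hypothetical, one-letter amino acid codes are assumed (K = lysine, G = glycine), and a sequence match is merely a candidate site, not evidence that hydroxylation occurs.

```python
def find_xkg_lysines(sequence: str) -> list[int]:
    """Return 1-based positions of lysines (K) found in an X-K-G context.

    Hypothetical helper: X is any residue, so the lysine must have at
    least one residue before it and a glycine immediately after it.
    """
    seq = sequence.upper()
    positions = []
    # Start at index 1 so that a residue X always precedes the lysine.
    for i in range(1, len(seq) - 1):
        if seq[i] == "K" and seq[i + 1] == "G":
            positions.append(i + 1)  # convert to 1-based numbering
    return positions


if __name__ == "__main__":
    # Illustrative collagen-like peptide; not taken from the source text.
    print(find_xkg_lysines("GPKGEKGAK"))
```

A lysine at the very start of the sequence is skipped because no residue X precedes it, and a terminal lysine is skipped because no glycine follows it.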

Diminished lysyl hydroxylase activity is involved in certain connective tissue disorders.
In particular, mutations, including a truncation and duplications within the coding region of the
gene for PLOD, have been described in patients with type VI Ehlers-Danlos syndrome (Hyland, J.
et al. (1992) Nature Genet. 2:228-31; Hautala, T. et al. (1993) Genomics 15:399-404).
Expression profiling 

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of 
molecules spatially distributed over, and stably associated with, the surface of a solid support.
Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find 
use in a variety of applications, such as gene sequencing, monitoring gene expression, gene 
mapping, bacterial identification, drug discovery, and combinatorial chemistry. 

One area in particular in which microarrays find use is in gene expression analysis. Array 
technology can provide a simple way to explore the expression of a single polymorphic gene or
the expression profile of a large number of related or unrelated genes. When the expression of a
single gene is examined, arrays are employed to detect the expression of a specific gene or its
variants. When an expression profile is examined, arrays provide a platform for identifying genes
that are tissue specific, are affected by a substance being tested in a toxicology assay, are part of a
signaling cascade, carry out housekeeping functions, or are specifically related to a particular
genetic predisposition, condition, disease, or disorder.
Alzheimer's Disease 

The potential application of gene expression profiling is also relevant to improving diagnosis, 
prognosis, and treatment of diseases such as Alzheimer's disease. For example, both the levels and 
sequences expressed in tissues from subjects with Alzheimer's disease may be compared with the
levels and sequences expressed in normal brain tissue. Alzheimer's disease is a progressive
neurodegenerative disorder that is characterized by the formation of senile plaques and neurofibrillary
tangles containing amyloid beta peptide. These plaques are found in limbic and association cortices
of the brain. The hippocampus is part of the limbic system and plays an important role in learning
and memory. In subjects with Alzheimer's disease, accumulating plaques damage the neuronal
architecture in limbic areas and eventually cripple the memory process. 
Steroid Hormones 

The potential application of gene expression profiling is relevant to measuring the toxic
response to potential therapeutic compounds and the metabolic response to therapeutic agents. For
instance, diseases treated with steroids and disorders caused by the metabolic response to treatment
with steroids include adenomatosis, cholestasis, cirrhosis, hemangioma, Henoch-Schonlein purpura,
hepatitis, hepatocellular and metastatic carcinomas, idiopathic thrombocytopenic purpura, porphyria,
sarcoidosis, and Wilson disease. It is desirable to measure the toxic response to potential therapeutic
compounds and the metabolic response to therapeutic agents.
Steroids are a class of lipid-soluble molecules, including cholesterol, bile acids, vitamin D,
and hormones, that share a common four-ring structure based on cyclopentanoperhydrophenanthrene
and that carry out a wide variety of functions. Steroid hormones, produced by the adrenal cortex,
ovaries, and testes, include glucocorticoids, mineralocorticoids, androgens, and estrogens. Steroid
hormones are widely used for fertility control and in anti-inflammatory treatments for physical
injuries and diseases such as arthritis, asthma, and auto-immune disorders. Progesterone, a naturally
occurring progestin, is primarily used to treat amenorrhea, abnormal uterine bleeding, or as a
contraceptive. Medroxyprogesterone (MAH), also known as 6α-methyl-17-hydroxyprogesterone, is a
synthetic progestin with a pharmacological activity about 15 times greater than progesterone. MAH
is used for the treatment of renal and endometrial carcinomas, amenorrhea, abnormal uterine
bleeding, and endometriosis associated with hormonal imbalance. MAH has a stimulatory effect on
respiratory centers and has been used in cases of low blood oxygenation caused by sleep apnea,
chronic obstructive pulmonary disease, or hypercapnia. Beclomethasone is a synthetic glucocorticoid
that is used to treat steroid-dependent asthma, to relieve symptoms associated with allergic or
nonallergic (vasomotor) rhinitis, or to prevent recurrent nasal polyps following surgical removal.
Budesonide is a corticosteroid used to control symptoms associated with allergic rhinitis or asthma.
Dexamethasone is a synthetic glucocorticoid used in anti-inflammatory or immunosuppressive
compositions. Prednisone is metabolized in the liver to its active form, prednisolone, a glucocorticoid
with anti-inflammatory properties. Betamethasone is a synthetic glucocorticoid with anti-inflammatory
and immunosuppressive activity and is used to treat psoriasis and fungal infections
such as athlete's foot and ringworm. By comparing both the levels and sequences expressed in tissues
from subjects exposed to or treated with steroid compounds with the levels and sequences expressed
in normal untreated tissue it is possible to determine tissue responses to steroids.
Breast Cancer 

Array technology can provide a simple way to explore the expression of a single polymorphic
gene or the expression profile of a large number of related or unrelated genes. When the expression
of a single gene is examined, arrays are employed to detect the expression of a specific gene or its
variants. When an expression profile is examined, arrays provide a platform for identifying genes
that are tissue specific, are affected by a substance being tested in a toxicology assay, are part of a
signaling cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.

The potential application of gene expression profiling is particularly relevant to improving
diagnosis, prognosis, and treatment of cancers, such as breast cancer, colon cancer, lung cancer,
ovarian cancer and prostate cancer. Breast cancer is the most frequently diagnosed type of cancer in
American women and the second most frequent cause of cancer death. The lifetime risk of an
American woman developing breast cancer is 1 in 8, and one-third of women diagnosed with breast
cancer die of the disease. A number of risk factors have been identified, including hormonal and
genetic factors. One genetic defect associated with breast cancer results in a loss of heterozygosity
(LOH) at multiple loci such as p53, Rb, BRCA1, and BRCA2. Another genetic defect is gene
amplification involving genes such as c-myc and c-erbB2 (Her2-neu gene). Steroid and growth factor
pathways are also altered in breast cancer, notably the estrogen, progesterone, and epidermal growth
factor (EGF) pathways. Breast cancer evolves through a multi-step process whereby premalignant
mammary epithelial cells undergo a relatively defined sequence of events leading to tumor formation.
An early event in tumor development is ductal hyperplasia. Cells undergoing rapid neoplastic growth
gradually progress to invasive carcinoma and become metastatic to the lung, bone, and potentially
other organs. Variables that may influence the process of tumor progression and malignant
transformation include genetic factors, environmental factors, growth factors, and hormones.
Colon Cancer 

Colon cancer evolves through a multi-step process whereby pre-malignant colonocytes
undergo a relatively defined sequence of events leading to tumor formation. While soft tissue
sarcomas are relatively rare, more than 50% of new patients diagnosed with the disease will die from
it. The molecular pathways leading to the development of sarcomas are relatively unknown, due to
the rarity of the disease and variation in pathology. Several factors participate in the process of tumor
progression and malignant transformation including genetic factors, mutations, and selection.

To understand the nature of gene alterations in colorectal cancer, a number of studies have
focused on the inherited syndromes. Familial adenomatous polyposis (FAP) is caused by mutations

Lung Cancer

Lung cancer is the leading cause of cancer death in the United States, affecting more than
100,000 men and 50,000 women each year. Nearly 90% of the patients diagnosed with lung cancer
are cigarette smokers. Tobacco smoke contains thousands of noxious substances that induce

epithelium. In nearly 80% of patients diagnosed with lung cancer, metastasis has already occurred.
Most commonly lung cancers metastasize to pleura, brain, bone, pericardium and liver. The decision
to treat with surgery, radiation therapy, or chemotherapy is made on the basis of tumor histology,
response to growth factors or hormones, and sensitivity to inhibitors or drugs. With current
treatments, most patients die within one year of diagnosis. Earlier diagnosis and a systematic
approach to identification, staging, and treatment of lung cancer could positively affect patient
outcome.



Lung cancers progress through a series of morphologically distinct stages from hyperplasia

secreting glands. Squamous cell carcinomas typically arise in proximal airways. The histogenesis of
squamous cell carcinomas may be related to chronic inflammation and injury to the bronchial
epithelium, leading to squamous metaplasia. The Small Cell Lung Carcinoma (SCLC) group
accounts for about 20% of lung cancer cases. SCLCs typically arise in proximal airways and exhibit

this disease. Deletion of the short arm of chromosome 3 is found in over 90% of cases and represents
one of the earliest genetic lesions leading to lung cancer. Deletions at chromosome arms 9p and 17p
are also common. Other frequently observed genetic lesions include overexpression of telomerase,
activation of oncogenes such as K-ras and c-myc, and inactivation of tumor suppressor genes such as
RB, p53 and CDKN2.

Genes differentially regulated in lung cancer have been identified by a variety of methods.
Using mRNA differential display technology, Manda et al. (1999; Genomics 51:5-14) identified five
genes differentially expressed in lung cancer cell lines compared to normal bronchial epithelial cells.
Among the known genes, pulmonary surfactant apoprotein A and alpha 2 macroglobulin were
downregulated whereas nm23-H1 was upregulated. Petersen et al. (2000; Int. J. Cancer 86:512-517) used
suppression subtractive hybridization to identify 552 clones differentially expressed in lung tumor
derived cell lines, 205 of which represented known genes. Among the known genes,
thrombospondin-1, fibronectin, intercellular adhesion molecule 1, and cytokeratins 6 and 18 were
previously observed to be differentially expressed in lung cancers. Wang et al. (2000; Oncogene
19:1519-1528) used a combination of microarray analysis and subtractive hybridization to identify 17
genes differentially overexpressed in squamous cell carcinoma compared with normal lung
epithelium. Among the known genes they identified were keratin isoform 6, KOC, SPRC, IGFb2,
connexin 26, plakophilin 1 and cytokeratin 13.
Ovarian Cancer 

Ovarian cancer is the leading cause of death from a gynecologic cancer. The majority of
ovarian cancers are derived from epithelial cells, and 70% of patients with epithelial ovarian cancers
present with late-stage disease. As a result, the long-term survival rate for this disease is very low.
Identification of early-stage markers for ovarian cancer would significantly increase the survival rate.
Genetic variations involved in ovarian cancer development include mutation of p53 and microsatellite
instability. Gene expression patterns likely vary when normal ovary is compared to ovarian tumors.
Prostate Cancer 

Prostate cancer is a common malignancy in men over the age of 50, and the incidence 
increases with age. In the US, there are approximately 132,000 newly diagnosed cases of prostate 
cancer and more than 33,000 deaths from the disorder each year. Once cancer cells arise in the 
prostate, they are stimulated by testosterone to a more rapid growth. Thus, removal of the testes can
indirectly reduce both rapid growth and metastasis of the cancer. Over 95 percent of prostatic cancers
are adenocarcinomas which originate in the prostatic acini. The remaining 5 percent are divided
between squamous cell and transitional cell carcinomas, both of which arise in the prostatic ducts or
other parts of the prostate gland.

activated by treatment with both a phorbol ester, such as phorbol myristate acetate (PMA), and
lipopolysaccharide (LPS). PMA is a broad activator of the protein kinase C-dependent pathways.

Monocytes are involved in the initiation and maintenance of inflammatory immune
responses. The outer membrane of gram-negative bacteria expresses lipopolysaccharide (LPS)
complexes called endotoxins. Toxicity is associated with the lipid component (Lipid A) of LPS, and
immunogenicity is associated with the polysaccharide components of LPS. LPS elicits a variety of
inflammatory responses, and because it activates complement by the alternative (properdin) pathway,
it is often part of the pathology of gram-negative bacterial infections. For the most part, endotoxins
remain associated with the cell wall until the bacteria disintegrate. LPS released into the bloodstream
by lysing gram-negative bacteria is first bound by certain plasma proteins identified as LPS-binding
proteins. The LPS-binding protein complex interacts with CD14 receptors on monocytes,
macrophages, B cells, and other types of receptors on endothelial cells. Activation of human B cells
with LPS results in mitogenesis as well as immunoglobulin synthesis. In monocytes and
macrophages three types of events are triggered during their interaction with LPS: 1) Production of
cytokines, including IL-1, IL-6, IL-8, TNF-α, and platelet-activating factor, which stimulate
production of prostaglandins and leukotrienes that mediate inflammation and septic shock; 2)
Activation of the complement cascade; and 3) Activation of the coagulation cascade.

There is a need in the art for new compositions, including nucleic acids and proteins, for
the diagnosis, prevention, and treatment of gastrointestinal, cardiovascular,
autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, reproductive,
endocrine, metabolic, pancreatic disorders, disorders associated with the adrenals, disorders
associated with gonadal steroid hormones, cancers, and infections.

SUMMARY OF THE INVENTION 

Various embodiments of the invention provide purified polypeptides, protein modification
and maintenance molecules, referred to collectively as 'PMMM' and individually as 'PMMM-1,'
'PMMM-2,' 'PMMM-3,' 'PMMM-4,' 'PMMM-5,' 'PMMM-6,' 'PMMM-7,' 'PMMM-8,' 'PMMM-9,'
'PMMM-10,' 'PMMM-11,' 'PMMM-12,' 'PMMM-13,' 'PMMM-14,' 'PMMM-15,' 'PMMM-16,'
'PMMM-17,' 'PMMM-18,' 'PMMM-19,' 'PMMM-20,' 'PMMM-21,' 'PMMM-22,' 'PMMM-23,'
'PMMM-24,' 'PMMM-25,' 'PMMM-26,' 'PMMM-27,' 'PMMM-28,' 'PMMM-29,' 'PMMM-30,'
and 'PMMM-31,' and methods for using these proteins and their encoding polynucleotides for the
detection, diagnosis, and treatment of diseases and medical conditions. Embodiments also provide
methods for utilizing the purified protein modification and maintenance molecules and/or their
encoding polynucleotides for facilitating the drug discovery process, including determination of
efficacy, dosage, toxicity, and pharmacology. Related embodiments provide methods for utilizing the
purified protein modification and maintenance molecules and/or their encoding polynucleotides for
investigating the pathogenesis of diseases and medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-31, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical
or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-31, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-31, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-31. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID
NO:1-31.
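
The "at least 90% identical" criterion used throughout these embodiments can be pictured with a small calculation. The sketch below is an assumption-laden illustration, not the specification's own definition of percent identity: it takes two sequences that have already been aligned to equal length (with "-" as a hypothetical gap symbol), counts gapped columns as mismatches, and divides matches by the number of alignment columns. Other conventions for the denominator exist and would give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column matches only when both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True when the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Made-up ten-residue peptides differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice the alignment itself would come from a dedicated alignment program; this sketch only covers the counting step that follows.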

Still another embodiment provides an isolated polynucleotide encoding a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-31,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-31. In another embodiment, the polynucleotide encodes a
polypeptide selected from the group consisting of SEQ ID NO:1-31. In an alternative embodiment,
the polynucleotide is selected from the group consisting of SEQ ID NO:32-62.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-31, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-31, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-31, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31. Another embodiment provides a cell transformed with the recombinant polynucleotide.
Yet another embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-31, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-31, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-31, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31. The method comprises a) culturing a cell under conditions suitable for expression of the
polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a
promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering
the polypeptide so expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:32-62, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:32-62, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.
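
The relationships among items a) through e) above (a sequence, its complement, its RNA equivalent, and contiguous stretches of at least about 20 nucleotides) can be sketched in a few lines. The helper names are hypothetical, the complement is reported as the reverse complement read 5' to 3', and "RNA equivalent" is taken here to mean the same base sequence with uracil in place of thymine; both readings are assumptions made for illustration.

```python
# Translation table for Watson-Crick pairing; ambiguity codes are not handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna):
    """Return the same base sequence with T replaced by U."""
    return dna.replace("T", "U").replace("t", "u")


def contiguous_fragments(sequence, length=20):
    """Return every window of `length` contiguous nucleotides."""
    return [sequence[i:i + length]
            for i in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    # Made-up 25-nucleotide sequence, not one of the SEQ ID NOs.
    seq = "ATGGCCAAGCTTGGATCCGAATTCA"
    print(reverse_complement(seq))
    print(rna_equivalent(seq))
    print(len(contiguous_fragments(seq, 20)))
```

A sequence shorter than the requested window yields an empty list, which is a convenient way to reject inputs that cannot supply a fragment of the required length.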

Yet another embodiment provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of SEQ ID NO:32-62, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:32-62, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and b)
detecting the presence or absence of said hybridization complex. In a related embodiment, the
method can include detecting the amount of the hybridization complex. In still other embodiments,
the probe can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.

Still yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:32-62, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:32-62, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof. In a related embodiment, the method can include detecting the
amount of the amplified target polynucleotide or fragment thereof.

Another embodiment provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31, and a pharmaceutically acceptable excipient.
In one embodiment, the composition can comprise an amino acid sequence selected from the group
consisting of SEQ ID NO:1-31. Other embodiments provide a method of treating a disease or
condition associated with decreased or abnormal expression of functional PMMM, comprising
administering to a patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31. The method comprises a) exposing a sample
comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. Another
embodiment provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating a
disease or condition associated with decreased expression of functional PMMM, comprising
administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-31. The method comprises a)
exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in
the sample. Another embodiment provides a composition comprising an antagonist compound
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment
provides a method of treating a disease or condition associated with overexpression of functional
PMMM, comprising administering to a patient in need of such treatment the composition.

Another embodiment provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31. The method comprises a) combining the
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the
polypeptide to the test compound, thereby identifying a compound that specifically binds to the
polypeptide.

Yet another embodiment provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-31, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-31, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-31, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31. The method comprises a) combining the
polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of the
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide
in the presence of the test compound is indicative of a compound that modulates the activity of the
polypeptide.

Still yet another embodiment provides a method for screening a compound for effectiveness 
in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:32-62, the method 
comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting 
altered expression of the target polynucleotide, and c) comparing the expression of the target 
polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound. 

Another embodiment provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:32-62, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:32-62, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs
under conditions whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide selected from the group consisting
of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:32-62, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the
group consisting of SEQ ID NO:32-62, iii) a polynucleotide complementary to the polynucleotide of
i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of
i)-iv). Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected
from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d)
comparing the amount of hybridization complex in the treated biological sample with the amount of
hybridization complex in an untreated biological sample, wherein a difference in the amount of
hybridization complex in the treated biological sample is indicative of toxicity of the test compound.



WO 03/025131 



PCT/US02/29221 



BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide 
embodiments of the invention. 

Table 2 shows the GenBank identification number and annotation of the nearest GenBank 
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME
database homologs, for polypeptide embodiments of the invention. The probability scores for the 
matches between each polypeptide and its homolog(s) are also shown. 

Table 3 shows structural features of polypeptide embodiments, including predicted motifs 
and domains, along with the methods, algorithms, and searchable databases used for analysis of the 
polypeptides.

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble 
polynucleotide embodiments, along with selected fragments of the polynucleotides. 

Table 5 shows representative cDNA libraries for polynucleotide embodiments. 

Table 6 provides an appendix which describes the tissues and vectors used for construction of 
the cDNA libraries shown in Table 5.

Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and 
polypeptides, along with applicable descriptions, references, and threshold parameters. 

Table 8 shows single nucleotide polymorphisms found in polynucleotide sequences of the 
invention, along with allele frequencies in different human populations. 

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleic acids, and methods are described, it is understood that 
embodiments of the invention are not limited to the particular machines, instruments, materials, and 
methods described, as these may vary. It is also to be understood that the terminology used herein is 
for the purpose of describing particular embodiments only, and is not intended to limit the scope of
the invention. 

As used herein and in the appended claims, the singular forms "a," "an," and "the" include 
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a 
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to one 
or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. 
Although any machines, materials, and methods similar or equivalent to those described herein can be 
used to practice or test the present invention, the preferred machines, materials and methods are now 
described. All publications mentioned herein are cited for the purpose of describing and disclosing
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The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a 
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or 
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally 
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino 
acid sequence to the complete native amino acid sequence associated with the recited protein
molecule. 

"Amplification" relates to the production of additional copies of a nucleic acid. 
Amplification may be carried out using polymerase chain reaction (PCR) technologies or other 
nucleic acid amplification technologies well known in the art. 

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity
of PMMM. Antagonists may include proteins such as antibodies, anticalins, nucleic acids,
carbohydrates, small molecules, or any other compound or composition which modulates the activity
of PMMM either by directly interacting with PMMM or by acting on components of the biological
pathway in which PMMM participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind PMMM polypeptides can be prepared using intact polypeptides or using
fragments containing small peptides of interest as the immunizing antigen. The polypeptide or
oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the
translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired.
Commonly used carriers that are chemically coupled to peptides include bovine serum albumin,
thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize
the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen
used to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No.
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules.
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polynucleotides encoding PMMM or fragments of PMMM may be employed as hybridization probes. 
The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as 
a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts 
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's 
solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (Accelrys,
Burlington MA) or Phrap (University of Washington, Seattle WA). Some sequences have been both
extended and assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least 
interfere with the properties of the original protein, i.e., the structure and especially the function of 
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded 
as conservative amino acid substitutions. 

Original Residue    Conservative Substitution

Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr


Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide 
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of 
the side chain. 
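
As an illustration only, the substitution table above can be held as a simple lookup. The sketch below is a minimal Python rendering; the names CONSERVATIVE_SUBSTITUTIONS and is_conservative are hypothetical and are not part of the specification. Note that the table is directional: it lists Ser as a substitute for Ala, but does not list Ala as a substitute for Ser.

```python
# Conservative substitutions transcribed from the table above
# (three-letter residue codes). The mapping is directional, as in the table.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the table lists `replacement` for the `original` residue."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

For example, is_conservative("Leu", "Ile") is true, while is_conservative("Ser", "Ala") is false because the table does not list Ala in the Ser row.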



A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide.
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which
retains at least one biological or immunological function of the natural molecule. A derivative
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least
one biological or immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be 
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of PMMM or a polynucleotide encoding PMMM which can 
be identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise
up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For
example, a fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid
residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes
may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous
nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain
regions of a molecule. For example, a polypeptide fragment may comprise a certain length of
contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a
polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any
length that is supported by the specification, including the Sequence Listing, tables, and figures, may
be encompassed by the present embodiments.

A fragment of SEQ ID NO:32-62 can comprise a region of unique polynucleotide sequence 
that specifically identifies SEQ ID NO:32-62, for example, as distinct from any other sequence in the 
genome from which the fragment was obtained. A fragment of SEQ ID NO:32-62 can be employed 
in one or more embodiments of methods of the invention, for example, in hybridization and 
amplification technologies and in analogous methods that distinguish SEQ ID NO:32-62 from related
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polynucleotides. The precise length of a fragment of SEQ ID NO:32-62 and the region of SEQ ID 
NO:32-62 to which the fragment corresponds are routinely determinable by one of ordinary skill in
the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-31 is encoded by a fragment of SEQ ID NO:32-62. A fragment
of SEQ ID NO:1-31 can comprise a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-31. For example, a fragment of SEQ ID NO:1-31 can be used as an immunogenic
peptide for the development of antibodies that specifically recognize SEQ ID NO:1-31. The precise
length of a fragment of SEQ ID NO:1-31 and the region of SEQ ID NO:1-31 to which the fragment
corresponds can be determined based on the intended purpose for the fragment using one or more
analytical methods described herein or otherwise known in the art.

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g., 
methionine) followed by an open reading frame and a translation termination codon. A "full length" 
polynucleotide sequence encodes a "full length" polypeptide sequence. 
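
The "full length" criterion can be illustrated with a short check. The following Python sketch is not a method of the specification: it assumes the input has already been trimmed to the coding region, uses ATG as the initiation codon and the three standard stop codons, and the function name is_full_length is hypothetical.

```python
# Standard stop codons (assumption: standard genetic code, DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_full_length(coding_sequence: str) -> bool:
    """Check for an initiation codon, an uninterrupted reading frame, and a stop codon."""
    seq = coding_sequence.upper()
    # Need at least a start and a stop codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    starts_with_atg = codons[0] == "ATG"
    ends_with_stop = codons[-1] in STOP_CODONS
    # The frame is "open" only if no stop codon appears before the last codon.
    frame_is_open = not any(codon in STOP_CODONS for codon in codons[1:-1])
    return starts_with_atg and ends_with_stop and frame_is_open
```

Under these assumptions, "ATGGCCTAA" passes the check, whereas "ATGTAAGCCTAA" fails because a stop codon interrupts the frame.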

"Homology" refers to sequence similarity or, alternatively, sequence identity, between two or 

more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer
to the percentage of identical residue matches between at least two polynucleotide sequences aligned
using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible
way, gaps in the sequences being compared in order to optimize alignment between two sequences,
and therefore achieve a more meaningful comparison of the two sequences.

Percent identity between polynucleotide sequences may be determined using one or more
computer algorithms or programs known in the art or described herein. For example, percent identity
can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into
the MEGALIGN version 3.12e sequence alignment program. This program is part of the
LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR,
Madison WI). CLUSTAL V is described in Higgins, D.G. and P.M. Sharp (1989; CABIOS 5:151-
153) and in Higgins, D.G. et al. (1992; CABIOS 8:189-191). For pairwise alignments of
polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5,
window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as the default.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410),
which is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence
analysis programs including "blastn," that is used to align a known polynucleotide sequence with
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other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2 
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html.
The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.
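
Once two sequences have been aligned, the percent identity figure reduces to a count of matching columns. The Python sketch below illustrates this under stated assumptions: the two input strings are already aligned and of equal length, "-" marks a gap, and the denominator is the number of alignment columns in which at least one sequence has a residue. Alignment programs differ in their choice of denominator, so the values returned here need not equal those reported by MEGALIGN or BLAST. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of compared alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1  # x cannot be a gap here, since double gaps were skipped
    return 100.0 * matches / compared if compared else 0.0
```

To measure identity over a fragment rather than the entire defined sequence, the same function can be applied to the corresponding slice of both aligned strings.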

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to
the percentage of identical residue matches between at least two polypeptide sequences aligned using
a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some
alignment methods take into account conservative amino acid substitutions. Such conservative
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the
site of substitution, thus preserving the structure (and therefore function) of the polypeptide. The
phrases "percent similarity" and "% similarity," as applied to polypeptide sequences, refer to the
percentage of residue matches, including identical residue matches and conservative substitutions,
between at least two polypeptide sequences aligned using a standardized algorithm. In contrast,
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conservative substitutions are not included in the calculation of percent identity "between polypeptide 
sequences. 
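
The distinction between percent identity and percent similarity can be made concrete with a small example. The Python sketch below is illustrative only. It assumes pre-aligned sequences in one-letter code, restates the conservative substitution table of this section in one-letter form, and treats a pair as conservative if either residue is listed as a substitute for the other; that symmetric reading is an assumption of the sketch. The names CONSERVATIVE and identity_and_similarity are hypothetical.

```python
# One-letter rendering of the conservative substitution table in this section.
CONSERVATIVE = {
    "A": "GS", "R": "HK", "N": "DQH", "D": "NE", "C": "AS",
    "Q": "NEH", "E": "DQH", "G": "A", "H": "NRQE", "I": "LV",
    "L": "IV", "K": "RQE", "M": "LI", "F": "HMLWY", "S": "CT",
    "T": "SV", "W": "FY", "Y": "HFW", "V": "ILT",
}


def identity_and_similarity(aligned_a: str, aligned_b: str, gap: str = "-"):
    """Return (percent identity, percent similarity) for two aligned polypeptides."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = conservative = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if gap in (x, y):
            continue  # a residue opposite a gap is neither identical nor similar
        if x == y:
            identical += 1
        elif y in CONSERVATIVE.get(x, "") or x in CONSERVATIVE.get(y, ""):
            conservative += 1
    if compared == 0:
        return 0.0, 0.0
    pct_identity = 100.0 * identical / compared
    pct_similarity = 100.0 * (identical + conservative) / compared
    return pct_identity, pct_similarity
```

For the aligned pair "MKLV" and "MRIV", two of four columns are identical and the other two are conservative substitutions, so the sketch reports 50% identity and 100% similarity.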

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be
used to describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain 
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability. 

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific 
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. 
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
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after the "washing" step(s). The washing step(s) is particularly important in determining the 
stringency of the hybridization process, with more stringent conditions allowing less non-specific 
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive 
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill 
in the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. and D.W.
Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3, Cold Spring Harbor Press,
Cold Spring Harbor NY, ch. 9).
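
By way of illustration, the relationship between Tm and wash temperature can be sketched in Python. The specification does not reproduce the equation, so the one used here is an assumption: a commonly quoted empirical form for long DNA duplexes, Tm = 81.5 + 16.6 log10[Na+] + 0.41 (%G+C) - 0.63 (% formamide) - 600/N, where N is the duplex length in nucleotides. Published variants use somewhat different coefficients, and the function names are hypothetical.

```python
import math


def melting_temperature(sequence: str, sodium_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Estimate Tm in degrees C with an assumed empirical long-duplex formula."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0 or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and a positive Na+ concentration")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length)


def wash_temperature_range(tm: float) -> tuple:
    """Wash temperatures about 5 to 20 degrees C below Tm, as described in the text."""
    return (tm - 20.0, tm - 5.0)
```

The estimate is sensitive to the assumed salt concentration and is intended only as a starting point for choosing wash conditions; short oligonucleotides generally call for a different formula.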

High stringency conditions for hybridization between polynucleotides of the present
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular
circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions
will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high
stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such
similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acids by
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters,
chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have
been fixed).
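
Base pairing between complementary strands can be illustrated computationally. The Python sketch below models only perfect Watson-Crick complementarity in DNA and ignores mismatches, stringency, and thermodynamics; the function names reverse_complement and can_pair_perfectly are hypothetical.

```python
# Translation table for Watson-Crick complements (DNA alphabet, either case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the strand that would base-pair with `sequence`, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def can_pair_perfectly(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the target."""
    return reverse_complement(probe).upper() in target.upper()
```

For instance, the probe "GCAT" has the reverse complement "ATGC", so it can pair with a target strand containing "ATGC", such as "TTATGCAA".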

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide 
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.
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"Immune response" can refer to conditions associated with inflammation, trauma, immune 
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression 
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect 
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of PMMM which is
capable of eliciting an immune response when introduced into a living organism, for example, a 
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment 
of PMMM which is useful in any of the antibody production methods disclosed herein or known in 
the art. 

The term "microarray" refers to an arrangement of a plurality of polynucleotides,
polypeptides, antibodies, or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or 
other chemical compound having a unique and defined position on a microarray. 

The term "modulate" refers to a change in the activity of PMMM. For example, modulation 
may cause an increase or a decrease in protein activity, binding characteristics, or any other
biological, functional, or immunological properties of PMMM. 

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide,
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a 
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably 
linked to a coding sequence if the promoter affects the transcription or expression of the coding 
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where 
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which 
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of 
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. 
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript
elongation, and may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of a PMMM may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in 
the art. These processes may occur synthetically or biochemically. Biochemical modifications will 
vary by cell type depending on the enzymatic milieu of PMMM. 
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"Probe" refers to nucleic acids encoding PMMM, their complements, or fragments thereof, 
which are used to detect identical, allelic or related nucleic acids. Probes are isolated 
oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical 
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide
by complementary base-pairing. The primer may then be extended along the target DNA strand by a 
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic 
acid, e.g., by the polymerase chain reaction (PCR). 

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.

Methods for preparing and using probes and primers are described in, for example,
Sambrook, J. and D.W. Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3,
Cold Spring Harbor Press, Cold Spring Harbor NY), Ausubel, F.M. et al. (1999; Short Protocols in
Molecular Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge
MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
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needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping 
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, 
thereby allowing selection of primers that hybridize to either the most conserved or least conserved 
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both 
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a
sequence that is made by an artificial combination of two or more otherwise separated segments of
sequence. This artificial combination is often accomplished by chemical synthesis or, more
commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques such as those described in Sambrook and Russell (supra). The term
recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion
of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid
sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a
vector that is used, for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated 
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions 
(UTRs). Regulatory elements interact with host or viral proteins which control transcription, 
translation, or RNA stability. 

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, 
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and 
other moieties known in the art. 

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear 
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of the 
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose 
instead of deoxyribose. 
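
The base substitution in this definition can be illustrated with a minimal sketch. The function name `rna_equivalent` is hypothetical and is not part of the specification; the sketch models only the thymine-to-uracil replacement, since the ribose backbone is a chemical property that a text string cannot represent.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA string: same linear order, T replaced by U.

    Hypothetical helper for illustration only. Accepts A, C, G, T in either case.
    """
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"not a plain DNA sequence, found: {sorted(unexpected)}")
    # Every occurrence of thymine is replaced with uracil; all other bases are kept.
    return dna.replace("T", "U").replace("t", "u")


print(rna_equivalent("ATGGCTTAA"))
```

The linear sequence and length are unchanged, which is why a DNA sequence listing can stand in for the corresponding RNA sequence.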

The term "sample" is used in its broadest sense. A sample suspected of containing PMMM, 
nucleic acids encoding PMMM, or fragments thereof may comprise a bodily fluid; an extract from a 
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or 
cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc. 

The terms "specific binding" and "specifically binding" refer to that interaction between a 
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or 
synthetic binding composition. The interaction is dependent upon the presence of a particular 
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding 
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide 
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A 
and the antibody will reduce the amount of labeled A that binds to the antibody. 

The term "substantially purified" refers to nucleic acid or amino acid sequences that are 
removed from their natural environment and are isolated or separated, and are at least about 60% free, 
preferably at least about 75% free, and most preferably at least about 90% free from other 
components with which they are naturally associated. 

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides 
by different amino acid residues or nucleotides, respectively. 

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, 
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, 
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound. 

A "transcript image" or "expression profile" refers to the collective pattern of gene 
expression by a particular cell type or tissue under given conditions at a given time. 

"Transformation" describes a process by which exogenous DNA is introduced into a recipient 
cell. Transformation may occur under natural or artificial conditions according to various methods 
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid 
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based 
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or 
viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term 
"transformed cells" includes stably transformed cells in which the inserted DNA is capable of 
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well 
as transiently transformed cells which express the inserted DNA or RNA for limited periods of time. 

A "transgenic organism," as used herein, is any organism, including but not limited to 
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic 
acid introduced by way of human intervention, such as by transgenic techniques well known in the 
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor 
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with 
a recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a 
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The 
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but 
rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms 
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants 
and animals. The isolated DNA of the present invention can be introduced into the host by methods 
known in the art, for example infection, transfection, transformation or transconjugation. Techniques 
for transferring the DNA of the present invention into such organisms are widely known and provided 
in references such as Sambrook and Russell (supra). 

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having 
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of 
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) 
set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at 
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater 
sequence identity over a certain defined length. A variant may be described as, for example, an 
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have 
significant identity to a reference molecule, but will generally have a greater or lesser number of 
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding 
polypeptide may possess additional functional domains or lack domains that are present in the 
reference molecule. Species variants are polynucleotides that vary from one species to another. The 
resulting polypeptides will generally have significant amino acid identity relative to each other. A 
polymorphic variant is a variation in the polynucleotide sequence of a particular gene between 
individuals of a given species. Polymorphic variants also may encompass "single nucleotide 
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The 
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a 
propensity for a disease state. 

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having 
at least 40% sequence identity or sequence similarity to the particular polypeptide sequence over a 
certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool 
Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for 
example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, 
or at least 99% or greater sequence identity or sequence similarity over a certain defined length of one 
of the polypeptides. 
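
The identity thresholds in the two "variant" definitions above can be illustrated with a small sketch. It is not the BLAST 2 Sequences procedure named in the definitions: it assumes the two sequences have already been aligned, with "-" marking gaps, and simply counts identical columns. The function names and the choice to count gap columns in the denominator are assumptions made for this illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are rows of one pairwise alignment (equal length, '-' for gaps).
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_variant_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if identity over the aligned length is at least the threshold (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


print(percent_identity("ACGTACGT", "ACGAAC-T"))
print(meets_variant_threshold("ACGTACGT", "ACGAAC-T"))
```

A real comparison would take the aligned region and the identity value directly from the alignment program's report, because the result depends on the scoring parameters and on the length over which identity is measured.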

THE INVENTION 

Various embodiments of the invention include new human protein modification and 
maintenance molecules (PMMM), the polynucleotides encoding PMMM, and the use of these 
compositions for the diagnosis, treatment, or prevention of gastrointestinal, cardiovascular, 
autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, reproductive, 
endocrine, metabolic, pancreatic disorders, disorders associated with the adrenals, disorders 
associated with gonadal steroid hormones, cancers, and infections. 

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide 
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated 
to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is 
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an 
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide 
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ 
ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as 
shown. Column 6 shows the Incyte ID numbers of physical, full length clones corresponding to the 
polypeptide and polynucleotide sequences of the invention. The full length clones encode 
polypeptides which have at least 95% sequence identity to the polypeptide sequences shown in 
column 3. 

Table 2 shows sequences with homology to the polypeptides of the invention as identified by 
BLAST analysis against the GenBank protein (genpept) database and the PROTEOME database. 
Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) 
and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides 
of the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the 
nearest GenBank homolog and the PROTEOME database identification numbers (PROTEOME ID 
NO:) of the nearest PROTEOME database homologs. Column 4 shows the probability scores for the 
matches between each polypeptide and its homolog(s). Column 5 shows the annotation of the 
GenBank and PROTEOME database homolog(s) along with relevant citations where applicable, all of 
which are expressly incorporated by reference herein. 

Table 3 shows various structural features of the polypeptides of the invention. Columns 1 
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding 
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. 
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential 
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the 
MOTIFS program of the GCG sequence analysis software package (Accelrys, Burlington MA). 
Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 
7 shows analytical methods for protein structure/function analysis and in some cases, searchable 
databases to which the analytical methods were applied. 

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these 
properties establish that the claimed polypeptides are protein modification and maintenance 
molecules. For example, SEQ ID NO:2 is 86% identical, from residue M1 to residue E738, and 96% 
identical, from residue K607 to residue L900, to human inter-alpha-trypsin inhibitor family heavy 
chain-related protein (GenBank ID g4096840) as determined by the Basic Local Alignment Search 
Tool (BLAST). (See Table 2.) The BLAST probability scores are 0.0 and 7.3e-152, which indicate 
the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:2 
also contains a von Willebrand factor type A domain as determined by searching for statistically 
significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein 
family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and additional BLAST analyses 
provide further corroborative evidence that SEQ ID NO:2 is a protease inhibitor. 

In another example, SEQ ID NO:9 is 50% identical, from residue M1 to residue G378, to 
Mus musculus mDj10 (GenBank ID g6567172) as determined by BLAST. The BLAST probability 
score is 9.7e-102. SEQ ID NO:9 also contains a DnaJ domain as determined by searching for 
statistically significant matches in the hidden Markov model (HMM)-based PFAM database. Data 
from BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative evidence that 
SEQ ID NO:9 is a molecular chaperone. 

In another example, SEQ ID NO:12 is 100% identical, from residue M1 to residue N344, to 
human phosphatidyl inositol glycan class T (GenBank ID g14456615) as determined by BLAST. The 
BLAST probability score is 5.4e-280. Data from BLAST-PRODOM analysis provide further 
corroborative evidence that SEQ ID NO:12 is a phosphatidyl inositol glycan. In an alternative 
example, SEQ ID NO:13 is 100% identical, from residue D63 to residue L476, to human 
phosphatidyl inositol glycan class T (GenBank ID g14456615) as determined by BLAST. The 
BLAST probability score is 4.7e-261. Data from BLAST-PRODOM analysis provide further 
corroborative evidence that SEQ ID NO:13 is a phosphatidyl inositol glycan. 

In yet another example, SEQ ID NO:15 is 97% identical, from residue D50 to residue D121, 
to human ubiquitin-conjugating enzyme HR6B (GenBank ID g11037550) as determined by BLAST. 
The BLAST probability score is 2.1e-58. SEQ ID NO:15 is localized to the subcellular region, has 
ubiquitination function, and is a protein conjugation factor as determined by BLAST analysis using 
the PROTEOME database. SEQ ID NO:15 also contains a ubiquitin-conjugating enzyme domain as 
determined by searching for statistically significant matches in the hidden Markov model (HMM)-based 
PFAM database. Data from BLAST-PRODOM, BLAST-DOMO, and PROFILESCAN 
analyses provide further corroborative evidence that SEQ ID NO:15 is a ubiquitin-conjugating 
enzyme. 

In a further example, SEQ ID NO:19 is 100% identical, from residue M1 to residue G82, and 
100% identical, from residue G82 to residue A652, to the large subunit of human CANP (GenBank 
ID g29664, residues M1-G82 and G144-A714 respectively) as determined by BLAST. The BLAST 
probability score is 0.0. SEQ ID NO:19 is homologous to other proteins, such as calpain, the large 
subunit of a cysteine protease, having cysteine protease activity and localized to the plasma 
membrane, as determined by BLAST analysis using the PROTEOME database. SEQ ID NO:19 also 
contains calpain and EF hand domains as determined by searching for statistically significant matches 
in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. 
Data from BLIMPS, MOTIFS, and BLAST analyses provide further corroborative evidence that SEQ 
ID NO:19 is a calpain cysteine protease. SEQ ID NO:1, SEQ ID NO:3-8, SEQ ID NO:10-11, SEQ ID 
NO:14, SEQ ID NO:16-18, and SEQ ID NO:20-31 were analyzed and annotated in a similar manner. 
The algorithms and parameters for the analysis of SEQ ID NO:1-31 are described in Table 7. 

As shown in Table 4, the full length polynucleotide embodiments were assembled using 
cDNA sequences or coding (exon) sequences derived from genomic DNA or any combination of 
these two types of sequences. Column 1 lists the polynucleotide sequence identification number 
(Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number 
(Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence 
in base pairs. Column 2 shows the nucleotide start (5') and stop (3') positions of the cDNA and/or 
genomic sequences used to assemble the full length polynucleotide embodiments, and of fragments of 
the polynucleotides which are useful, for example, in hybridization or amplification technologies that 
identify SEQ ID NO:32-62 or that distinguish between SEQ ID NO:32-62 and related 
polynucleotides. 

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for 
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA 
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank 
cDNAs or ESTs which contributed to the assembly of the full length polynucleotides. In addition, the 
polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL 
(The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation 
"ENST"). Alternatively, the polynucleotide fragments described in column 2 may be derived from 
the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the 
designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those sequences 
including the designation "NP"). Alternatively, the polynucleotide fragments described in column 2 
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon 
stitching" algorithm. For example, a polynucleotide sequence identified as 
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the 
identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is 
the number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific 
exons that may have been manually edited during analysis (See Example V). Alternatively, the 
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an 
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as 
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte 
project identification number, gAAAAA being the GenBank identification number of the human 
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the 
GenBank identification number or NCBI RefSeq identification number of the nearest GenBank 
protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq 
sequence was used as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier 
(denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB). 

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from 
genomic DNA sequences, or derived from a combination of sequence analysis methods. The 
following Table lists examples of component sequence prefixes and corresponding sequence analysis 
methods associated with the prefixes (see Example IV and Example V). 



Prefix            Type of analysis and/or examples of programs

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example,
                  GENSCAN (Stanford University, CA, USA) or FGENES
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI               Hand-edited analysis of genomic sequences.

FL                Stitched or stretched genomic sequences (see Example V).

INCY              Full length transcript and exon prediction from mapping of EST
                  sequences to the genome. Genomic location and EST composition
                  data are combined to predict the exons and resulting transcript.

In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table 
4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA 
identification numbers are not shown. 

Table 5 shows the representative cDNA libraries for those full length polynucleotides which 
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA 
library which is most frequently represented by the Incyte cDNA sequences which were used to 
assemble and confirm the above polynucleotides. The tissues and vectors which were used to 
construct the cDNA libraries shown in Table 5 are described in Table 6. 

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide sequences of 
the invention, along with allele frequencies in different human populations. Columns 1 and 2 show 
the polynucleotide sequence identification number (SEQ ID NO:) and the corresponding Incyte 
project identification number (PID) for polynucleotides of the invention. Column 3 shows the Incyte 
identification number for the EST in which the SNP was detected (EST ID), and column 4 shows the 
identification number for the SNP (SNP ID). Column 5 shows the position within the EST sequence 
at which the SNP is located (EST SNP), and column 6 shows the position of the SNP within the 
full-length polynucleotide sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence. 
Columns 8 and 9 show the two alleles found at the SNP site. Column 10 shows the amino acid 
encoded by the codon including the SNP site, based upon the allele found in the EST. Columns 11-14 
show the frequency of allele 1 in four different human populations. An entry of n/d (not detected) 
indicates that the frequency of allele 1 in the population was too low to be detected, while n/a (not 
available) indicates that the allele frequency was not determined for the population. 

The invention also encompasses PMMM variants. Various embodiments of PMMM variants 
can have at least about 80%, at least about 90%, or at least about 95% amino acid sequence identity to 
the PMMM amino acid sequence, and can contain at least one functional or structural characteristic 
of PMMM. 

Various embodiments also encompass polynucleotides which encode PMMM. In a particular 
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected 
from the group consisting of SEQ ID NO:32-62, which encodes PMMM. The polynucleotide 
sequences of SEQ ID NO:32-62, as presented in the Sequence Listing, embrace the equivalent RNA 
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the 
sugar backbone is composed of ribose instead of deoxyribose. 

The invention also encompasses variants of a polynucleotide encoding PMMM. In particular, 
such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or 
even at least about 95% polynucleotide sequence identity to a polynucleotide encoding PMMM. A 
particular aspect of the invention encompasses a variant of a polynucleotide comprising a sequence 
selected from the group consisting of SEQ ID NO:32-62 which has at least about 70%, or 
alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a 
nucleic acid sequence selected from the group consisting of SEQ ID NO:32-62. Any one of the 
polynucleotide variants described above can encode a polypeptide which contains at least one 
functional or structural characteristic of PMMM. 

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant 
of a polynucleotide encoding PMMM. A splice variant may have portions which have significant 
sequence identity to a polynucleotide encoding PMMM, but will generally have a greater or lesser 
number of polynucleotides due to additions or deletions of blocks of sequence arising from alternate 
splicing of exons during mRNA processing. A splice variant may have less than about 70%, or 
alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence 
identity to a polynucleotide encoding PMMM over its entire length; however, portions of the splice 
variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least about 
95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide 
encoding PMMM. For example, a polynucleotide comprising a sequence of SEQ ID NO:43 and a 
polynucleotide comprising a sequence of SEQ ID NO:44 are splice variants of each other. Any one 
of the splice variants described above can encode a polypeptide which contains at least one functional 
or structural characteristic of PMMM. 

It will be appreciated by those skilled in the art that as a result of the degeneracy of the 
genetic code, a multitude of polynucleotide sequences encoding PMMM, some bearing minimal 
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be 
produced. Thus, the invention contemplates each and every possible variation of polynucleotide 
sequence that could be made by selecting combinations based on possible codon choices. These 
combinations are made in accordance with the standard triplet genetic code as applied to the 
polynucleotide sequence of naturally occurring PMMM, and all such variations are to be considered 
as being specifically disclosed. 
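
The size of the "multitude" produced by codon degeneracy can be made concrete with a short sketch. It builds the standard triplet genetic code and multiplies, residue by residue, the number of codons available for each amino acid. The function name `count_coding_sequences` and the example peptides are hypothetical illustrations and are not sequences of the invention.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, listed with bases in T, C, A, G order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons for each amino acid (for example, 6 for leucine).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_coding_sequences(peptide: str) -> int:
    """Count the distinct DNA sequences that encode the given one-letter peptide."""
    for aa in peptide:
        if aa == "*" or aa not in CODONS_PER_AA:
            raise ValueError(f"unknown amino acid code: {aa!r}")
    return prod(CODONS_PER_AA[aa] for aa in peptide)


print(count_coding_sequences("MW"))    # Met and Trp each have a single codon
print(count_coding_sequences("MLSR"))  # Leu, Ser, and Arg each have six codons
```

Because the count is a product over all residues, it grows roughly exponentially with peptide length, which is why the text treats the variations as a class rather than listing them.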

Although polynucleotides which encode PMMM and its variants are generally capable of 
hybridizing to polynucleotides encoding naturally occurring PMMM under appropriately selected 
conditions of stringency, it may be advantageous to produce polynucleotides encoding PMMM or its 
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring 
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a 
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular 
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence 
encoding PMMM and its derivatives without altering the encoded amino acid sequences include the 
production of RNA transcripts having more desirable properties, such as a greater half-life, than 
transcripts produced from the naturally occurring sequence. 
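
Codon selection according to host usage, as described above, can be sketched as follows. This is a minimal illustration, not a method taken from the specification: each codon is swapped for the synonymous codon with the highest frequency in a host usage table, and the encoded amino acid sequence is checked to be unchanged. The names `recode_for_host` and `TOY_USAGE`, the usage values, and the input sequence are all hypothetical; a real application would use measured codon usage for the chosen host.

```python
from itertools import product

# Standard genetic code, listed with bases in T, C, A, G order; '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acid codes."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode_for_host(dna: str, host_usage: dict) -> str:
    """Replace each codon with the synonymous codon most used by the host.

    Codons absent from host_usage count as frequency 0.0; on a tie the original
    codon is kept, so codons with no usage data are left unchanged.
    """
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
        best = max(synonyms, key=lambda c: (host_usage.get(c, 0.0), c == codon))
        recoded.append(best)
    return "".join(recoded)


# Made-up usage fractions for illustration only; not data for any real organism.
TOY_USAGE = {"CTG": 0.50, "CTA": 0.04, "GCG": 0.36, "GCA": 0.21}

original = "ATGCTAGCATAA"
recoded = recode_for_host(original, TOY_USAGE)
assert translate(recoded) == translate(original)  # encoded protein is unchanged
print(original, "->", recoded)
```

Picking the single most frequent codon is the simplest possible rule. Other considerations named in the text, such as transcript half-life, are not modeled here.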

The invention also encompasses production of polynucleotides which encode PMMM and 
PMMM derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the 
synthetic polynucleotide may be inserted into any of the many available expression vectors and cell 
systems using reagents well known in the art. Moreover, synthetic chemistry may be used to 
introduce mutations into a polynucleotide encoding PMMM or any fragment thereof. 

Embodiments of the invention can also include polynucleotides that are capable of 
hybridizing to the claimed polynucleotides, and, in particular, to those having the sequences shown in 
SEQ ID NO:32-62 and fragments thereof, under various conditions of stringency (Wahl, G.M. and 
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 
152:507-511). Hybridization conditions, including annealing and wash conditions, are described in 
"Definitions." 

Methods for DNA sequencing are well known in the art and may be used to practice any of 
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment 
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied 
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations 
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification 
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines such 
as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler (MJ 
Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems). 
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied 
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other 
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which 
are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R.A. (1995) Molecular Biology and 
Biotechnology, Wiley VCH, New York NY, pp. 856-853). 

The nucleic acids encoding PMMM may be extended utilizing a partial nucleotide sequence 
and employing various PCR-based methods known in the art to detect upstream sequences, such as 
promoters and regulatory elements. For example, one method which may be employed, 
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic 
DNA within a cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method, 
inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a 
circularized template. The template is derived from restriction fragments comprising a known 
genomic locus and surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A 
third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known 
sequences in human and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR 
Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations 
may be used to insert an engineered double-stranded sequence into a region of unknown sequence 
before performing PCR. Other methods which may be used to retrieve unknown sequences are 
known in the art (Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may 
use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto CA) to walk 
genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon 
junctions. For all PCR-based methods, primers may be designed using commercially available 
software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth MN) or 
another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of 
about 50% or more, and to anneal to the template at temperatures of about 68°C to 72°C. 
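
The length and GC-content criteria stated above can be checked with a few lines of code. This sketch is not the OLIGO software and does not reproduce its analysis: it tests only length (22 to 30 nucleotides) and GC fraction (0.50 or more), and it does not estimate annealing temperature, which depends on a melting-temperature model the text does not specify. The function names and example primers are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """True if the primer satisfies the length and GC-content guidelines.

    Annealing temperature (about 68 to 72 degrees C in the text) is not evaluated.
    """
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidate primers, for illustration only.
for candidate in ("GCGCGCATATGCGCGCATATGCGC", "ATATATATATATATATATATATAT", "GCGC"):
    print(candidate, len(candidate), round(gc_fraction(candidate), 2),
          meets_primer_criteria(candidate))
```

The bounds are exposed as parameters because the text gives them as approximate values ("about 22 to 30", "about 50% or more") rather than hard limits.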

When screening for full length cDNAs, it is preferable to use libraries that have been 
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include 
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T) 
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence 
into 5' non-transcribed regulatory regions. 

Capillary electrophoresis systems which are commercially available may be used to analyze 
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary 
sequencing may employ flowable polymers for electrophoretic separation, four different 
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the 
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate 
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire 
process from loading of samples to computer analysis and electronic data display may be computer 
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments 
which may be present in limited amounts in a particular sample. 

In another embodiment of the invention, polynucleotides or fragments thereof which encode 
PMMM may be cloned in recombinant DNA molecules that direct expression of PMMM, or 
fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy 
of the genetic code, other polynucleotides which encode substantially the same or a functionally 
equivalent polypeptide may be produced and used to express PMMM. 

The polynucleotides of the invention can be engineered using methods generally known in the
art in order to alter PMMM-encoding sequences for a variety of purposes including, but not limited
to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of PMMM, such as its biological or enzymatic activity or its ability
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.



In another embodiment, polynucleotides encoding PMMM may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, PMMM itself or a fragment thereof may be synthesized using chemical methods known
in the art. For example, peptide synthesis can be performed using various solution-phase or
solid-phase techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH
Freeman, New York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated
synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems).
Additionally, the amino acid sequence of PMMM, or any part thereof, may be altered during direct
synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a
variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid 
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The 
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing 
(Creighton, supra, pp. 28-53). 

In order to express a biologically active PMMM, the polynucleotides encoding PMMM or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotides encoding
PMMM. Such elements may vary in their strength and specificity. Specific initiation signals may
also be used to achieve more efficient translation of polynucleotides encoding PMMM. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where a
polynucleotide sequence encoding PMMM and its initiation codon and upstream regulatory
sequences are inserted into the appropriate expression vector, no additional transcriptional or
translational control signals may be needed. However, in cases where only coding sequence, or a
fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG
initiation codon should be provided by the vector. Exogenous translational elements and initiation
codons may be of various origins, both natural and synthetic. The efficiency of expression may be
enhanced by the inclusion of enhancers appropriate for the particular host cell system used (Scharf,
D. et al. (1994) Results Probl. Cell Differ. 20:125-162).
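
The two cases distinguished above, an insert that carries its own initiation codon and an insert that relies on an ATG supplied by the vector, can be sketched as a simple reading-frame check. This is an illustrative Python sketch only; the function names, the notion of a "spacer" between the vector ATG and the insert, and the example sequences are assumptions not found in the text, and the sketch assumes the first base of the insert begins a codon.

```python
def has_own_start_codon(insert: str) -> bool:
    """True if the insert begins with an ATG initiation codon."""
    return insert.upper().startswith("ATG")


def vector_atg_in_frame(spacer_len: int) -> bool:
    """True if the number of bases between the vector-supplied ATG and the
    first base of the insert is a multiple of three, so that the insert is
    read in the same frame as the vector ATG."""
    if spacer_len < 0:
        raise ValueError("spacer length cannot be negative")
    return spacer_len % 3 == 0


# Hypothetical sequences: a coding sequence with its own ATG, and a
# fragment of it that lacks the initiation codon.
full_cds = "ATGGCTAAACTG"
fragment = "GCTAAACTG"

print(has_own_start_codon(full_cds))   # insert supplies its own ATG
print(has_own_start_codon(fragment))   # vector must supply an in-frame ATG
print(vector_atg_in_frame(6), vector_atg_in_frame(7))
```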

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing polynucleotides encoding PMMM and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination (Sambrook and Russell, supra, ch. 1-4, and 8; Ausubel et al.,
supra, ch. 1, 3, and 15).

A variety of expression vector/host systems may be utilized to contain and express
polynucleotides encoding PMMM. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook and Russell, supra; Ausubel et al.,
supra; Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al.
(1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther.
7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and
Technology (1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659; Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).
Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from
various bacterial plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue,
or cell population (Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5:350-356; Yu, M. et al. (1993)
Proc. Natl. Acad. Sci. USA 90:6340-6344; Buller, R.M. et al. (1985) Nature 317:813-815; McGregor,
D.R. et al. (1994) Mol. Immunol. 31:219-226; Verma, I.M. and N. Somia (1997) Nature
389:239-242). The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotides encoding PMMM. For example, routine cloning,
subcloning, and propagation of polynucleotides encoding PMMM can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Invitrogen). Ligation of polynucleotides encoding PMMM into the vector's multiple
cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem.
264:5503-5509). When large quantities of PMMM are needed, e.g. for the production of antibodies,
vectors which direct high level expression of PMMM may be used. For example, vectors containing
the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of PMMM. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel
et al., supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184).

Plant systems may also be used for expression of PMMM. Transcription of polynucleotides
encoding PMMM may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196).

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, polynucleotides encoding PMMM may be
ligated into an adenovirus transcription/translation complex consisting of the late promoter and
tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be
used to obtain infective virus which expresses PMMM in host cells (Logan, J. and T. Shenk (1984)
Proc. Natl. Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous
sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40
or EBV-based vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino
polymers, or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).

For long term production of recombinant proteins in mammalian systems, stable expression
of PMMM in cell lines is preferred. For example, polynucleotides encoding PMMM can be
transformed into cell lines using expression vectors which may contain viral origins of replication
and/or endogenous expression elements and a selectable marker gene on the same or on a separate
vector. Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days
in enriched media before being switched to selective media. The purpose of the selectable marker is
to confer resistance to a selective agent, and its presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; als confers
resistance to chlorsulfuron; and pat, which encodes phosphinothricin acetyltransferase, confers
resistance to phosphinothricin (Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570;
Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14). Additional selectable genes have been
described, e.g., trpB and hisD, which alter cellular requirements for metabolites (Hartman, S.C. and
R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051). Visible markers, e.g.,
anthocyanins, green fluorescent proteins (GFP; Clontech), β-glucuronidase and its substrate
β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest is
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding PMMM is inserted within a marker gene sequence, transformed cells containing
polynucleotides encoding PMMM can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding PMMM under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding PMMM and that express
PMMM may be identified by a variety of procedures known to those of skill in the art. These
procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR
amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or
chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of PMMM using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on PMMM is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art
(Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect.
IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; Pound, J.D. (1998) Immunochemical Protocols, Humana Press,
Totowa NJ).

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding PMMM
include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, polynucleotides encoding PMMM, or any fragments thereof, may be cloned into a
vector for the production of an mRNA probe. Such vectors are known in the art, are commercially
available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA
polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted
using a variety of commercially available kits, such as those provided by Amersham Biosciences,
Promega (Madison WI), and US Biochemical. Suitable reporter molecules or labels which may be
used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or
chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with polynucleotides encoding PMMM may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors
containing polynucleotides which encode PMMM may be designed to contain signal sequences which
direct secretion of PMMM through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted polynucleotides or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves
a "prepro" or "pro" form of the protein may also be used to specify protein targeting, folding, and/or
activity. Different host cells which have specific cellular machinery and characteristic mechanisms
for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant polynucleotides
encoding PMMM may be ligated to a heterologous sequence resulting in translation of a fusion
protein in any of the aforementioned host systems. For example, a chimeric PMMM protein
containing a heterologous moiety that can be recognized by a commercially available antibody may
facilitate the screening of peptide libraries for inhibitors of PMMM activity. Heterologous protein
and peptide moieties may also facilitate purification of fusion proteins using commercially available
affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST),
maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG,
c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their
cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and
metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity
purification of fusion proteins using commercially available monoclonal and polyclonal antibodies
that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a
proteolytic cleavage site located between the PMMM encoding sequence and the heterologous protein
sequence, so that PMMM may be cleaved away from the heterologous moiety following purification.
Methods for fusion protein expression and purification are discussed in Ausubel et al. (supra, ch. 10
and 16). A variety of commercially available kits may also be used to facilitate expression and
purification of fusion proteins.

In another embodiment, synthesis of radiolabeled PMMM may be achieved in vitro using the
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple
transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6
promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, ³⁵S-methionine.

PMMM, fragments of PMMM, or variants of PMMM may be used to screen for compounds
that specifically bind to PMMM. One or more test compounds may be screened for specific binding
to PMMM. In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be
screened for specific binding to PMMM. Examples of test compounds can include antibodies,
anticalins, oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of PMMM can be used to screen for binding of test
compounds, such as antibodies, to PMMM, a variant of PMMM, or a combination of PMMM and/or
one or more variants of PMMM. In an embodiment, a variant of PMMM can be used to screen for
compounds that bind to a variant of PMMM, but not to PMMM having the exact sequence of a
sequence of SEQ ID NO:1-31. PMMM variants used to perform such screening can have a range of
about 50% to about 99% sequence identity to PMMM, with various embodiments having 60%, 70%,
75%, 80%, 85%, 90%, and 95% sequence identity.
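
As an illustration of how a percent identity figure in the stated range might be computed, the following Python sketch scores two pre-aligned sequences. It assumes one simple convention (identical residues divided by aligned, non-gap columns); the text here does not prescribe this convention, and the function names and example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').
    Both inputs must already be aligned to the same length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def in_variant_range(identity: float, low: float = 50.0,
                     high: float = 99.0) -> bool:
    """True if the identity falls within about 50% to about 99%."""
    return low <= identity <= high


# Hypothetical ten-residue peptides differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
identity = percent_identity(reference, variant)
print(identity, in_variant_range(identity))
```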

In an embodiment, a compound identified in a screen for specific binding to PMMM can be
closely related to the natural ligand of PMMM, e.g., a ligand or fragment thereof, a natural substrate,
a structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can
be a natural ligand of a receptor PMMM (Howard, A.D. et al. (2001) Trends Pharmacol. Sci.
22:132-140; Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

In other embodiments, a compound identified in a screen for specific binding to PMMM can
be closely related to the natural receptor to which PMMM binds, at least a fragment of the receptor,
or a fragment of the receptor including all or a portion of the ligand binding site or binding pocket.
For example, the compound may be a receptor for PMMM which is capable of propagating a signal,
or a decoy receptor for PMMM which is not capable of propagating a signal (Ashkenazi, A. and V.M.
Dixit (1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol.
22:328-336). The compound can be rationally designed using known techniques. Examples of such
techniques include those used to construct the compound etanercept (ENBREL; Amgen Inc.,
Thousand Oaks CA), which is efficacious for treating rheumatoid arthritis in humans. Etanercept is
an engineered p75 tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG1
(Taylor, P.C. et al. (2001) Curr. Opin. Immunol. 13:611-616).

In one embodiment, two or more antibodies having similar or, alternatively, different
specificities can be screened for specific binding to PMMM, fragments of PMMM or variants of
PMMM. The binding specificity of the antibodies thus screened can thereby be selected to identify
particular fragments or variants of PMMM. In one embodiment, an antibody can be selected such
that its binding specificity allows for preferential identification of specific fragments or variants of
PMMM. In another embodiment, an antibody can be selected such that its binding specificity allows
for preferential diagnosis of a specific disease or condition having increased, decreased, or otherwise
abnormal production of PMMM.

In an embodiment, anticalins can be screened for specific binding to PMMM, fragments of
PMMM, or variants of PMMM. Anticalins are ligand-binding proteins that have been constructed
based on a lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184;
Skerra, A. (2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a
beta-barrel having eight antiparallel beta-strands, which supports four loops at its open end. These
loops form the natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro
by amino acid substitutions to impart novel binding specificities. The amino acid substitutions can be
made using methods known in the art or described herein, and can include conservative substitutions
(e.g., substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or
significantly alter binding specificity.

In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit
PMMM involves producing appropriate cells which express PMMM, either as a secreted protein or
on the cell membrane. Preferred cells can include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing PMMM or cell membrane fractions which contain PMMM are then contacted with a
test compound and binding, stimulation, or inhibition of activity of either PMMM or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with PMMM, either in
solution or affixed to a solid support, and detecting the binding of PMMM to the compound.
Alternatively, the assay may detect or measure binding of a test compound in the presence of a
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a
solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or
to inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radio-labeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No.
6,372,724. In a related embodiment, one or more amino acid substitutions can be introduced into a
polypeptide compound (such as a receptor) to improve or alter its ability to bind to its natural ligands
(Matthews, D.J. and J.A. Wells. (1994) Chem. Biol. 1:25-30). In another related embodiment, one or
more amino acid substitutions can be introduced into a polypeptide compound (such as a ligand) to
improve or alter its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells (1991)
Proc. Natl. Acad. Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem.
266:10982-10988).

PMMM, fragments of PMMM, or variants of PMMM may be used to screen for compounds
that modulate the activity of PMMM. Such compounds may include agonists, antagonists, or partial
or inverse agonists. In one embodiment, an assay is performed under conditions permissive for
PMMM activity, wherein PMMM is combined with at least one test compound, and the activity of
PMMM in the presence of a test compound is compared with the activity of PMMM in the absence of
the test compound. A change in the activity of PMMM in the presence of the test compound is
indicative of a compound that modulates the activity of PMMM. Alternatively, a test compound is
combined with an in vitro or cell-free system comprising PMMM under conditions suitable for
PMMM activity, and the assay is performed. In either of these assays, a test compound which
modulates the activity of PMMM may do so indirectly and need not come in direct contact with
PMMM. At least one and up to a plurality of test compounds may be screened.
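
The comparison described above, activity in the presence of a test compound versus activity in its absence, can be sketched as a simple ratio. This Python sketch is illustrative only: the function names, the 10% tolerance used to decide that a change has been detected, and the activity values are assumptions and are not specified in the text.

```python
def relative_activity(with_compound: float, without_compound: float) -> float:
    """Ratio of activity measured with the test compound to activity
    measured without it (same units for both measurements)."""
    if without_compound <= 0:
        raise ValueError("control activity must be positive")
    return with_compound / without_compound


def classify_modulation(with_compound: float, without_compound: float,
                        tolerance: float = 0.10) -> str:
    """Label a test compound by the direction of the activity change.
    The tolerance is a hypothetical cutoff for calling a change."""
    ratio = relative_activity(with_compound, without_compound)
    if ratio > 1 + tolerance:
        return "increases activity"
    if ratio < 1 - tolerance:
        return "decreases activity"
    return "no change detected"


# Hypothetical activity readings in arbitrary units.
print(classify_modulation(50.0, 100.0))
print(classify_modulation(150.0, 100.0))
print(classify_modulation(100.0, 100.0))
```

In practice the cutoff would be set from replicate measurements and assay variability rather than fixed in advance.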

In another embodiment, polynucleotides encoding PMMM or their mammalian homologs
may be "knocked out" in an animal model system using homologous recombination in embryonic
stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal
models of human disease (see, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse
embryo and grown in culture. The ES cells are transformed with a vector containing the gene of
interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R.
(1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host
genome by homologous recombination. Alternatively, homologous recombination takes place using
the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific
manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids
Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell
blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred
to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce
heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential
therapeutic or toxic agents.

Polynucleotides encoding PMMM may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding PMMM can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a
region of a polynucleotide encoding PMMM is injected into animal ES cells, and the injected
sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and
the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and
treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
Alternatively, a mammal inbred to overexpress PMMM, e.g., by secreting PMMM in its milk, may
also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev.
4:55-74).

THERAPEUTICS

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists
between regions of PMMM and protein modification and maintenance molecules. In addition,
examples of tissues expressing PMMM can be found in Table 6 and can also be found in Example XI.
Therefore, PMMM appears to play a role in gastrointestinal, cardiovascular,
autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, reproductive,
endocrine, metabolic, pancreatic disorders, disorders associated with the adrenals, disorders
associated with gonadal steroid hormones, cancers, and infections. In the treatment of disorders
associated with increased PMMM expression or activity, it is desirable to decrease the expression or
activity of PMMM. In the treatment of disorders associated with decreased PMMM expression or
activity, it is desirable to increase the expression or activity of PMMM.

Therefore, in one embodiment, PMMM or a fragment or derivative thereof may be 
administered to a subject to treat or prevent a disorder associated with decreased expression or 
activity of PMMM. Examples of such disorders include, but are not limited to, a gastrointestinal
disorder, such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal
carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis,
gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal
obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis,
pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis,
passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis,
Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic
obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal
hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic
encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease,
alpha-1 antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein
obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis,
veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of
pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a
cardiovascular disorder, such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis,
Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and
phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular
replacement, and coronary artery bypass graft surgery, congestive heart failure, ischemic heart
disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular
heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular
calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective
endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus,
carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease,
congenital heart disease, and complications of cardiac transplantation; an autoimmune/inflammatory
disease, such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory
distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis,
atherosclerotic plaque rupture, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis, degradation of articular cartilage, osteoporosis,
pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's
syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic
purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and
extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and
trauma; a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis,
cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal
hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia; a developmental disorder,
such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne
and Becker muscular dystrophy, bone resorption, epilepsy, gonadal dysgenesis, WAGR syndrome
(Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis
syndrome, myelodysplasia syndrome, hereditary myoepithelial dysplasia, hereditary keratodermas,
hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, age-related macular degeneration, and
sensorineural hearing loss; an epithelial disorder, such as dyshidrotic eczema, allergic contact
dermatitis, keratosis pilaris, melasma, vitiligo, actinic keratosis, basal cell carcinoma, squamous cell
carcinoma, seborrheic keratosis, folliculitis, herpes simplex, herpes zoster, varicella, candidiasis,
dermatophytosis, scabies, insect bites, cherry angioma, keloid, dermatofibroma, acrochordons,
urticaria, transient acantholytic dermatosis, xerosis, eczema, atopic dermatitis, contact dermatitis,
hand eczema, nummular eczema, lichen simplex chronicus, asteatotic eczema, stasis dermatitis and
stasis ulceration, seborrheic dermatitis, psoriasis, lichen planus, pityriasis rosea, impetigo, ecthyma,
dermatophytosis, tinea versicolor, warts, acne vulgaris, acne rosacea, pemphigus vulgaris, pemphigus
foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis
herpetiformis, linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus
erythematosus, scleroderma and morphea, erythroderma, alopecia, figurate skin lesions,
telangiectasias, hypopigmentation, hyperpigmentation, vesicles/bullae, exanthems, cutaneous drug
reactions, papulonodular skin lesions, chronic non-healing wounds, photosensitivity diseases,
epidermolysis bullosa simplex, epidermolytic hyperkeratosis, epidermolytic and nonepidermolytic
palmoplantar keratoderma, ichthyosis bullosa of Siemens, ichthyosis exfoliativa, keratosis palmaris et
plantaris, keratosis palmoplantaris, palmoplantar keratoderma, keratosis punctata, Meesmann's
corneal dystrophy, pachyonychia congenita, white sponge nevus, steatocystoma multiplex, epidermal
nevi/epidermolytic hyperkeratosis type, monilethrix, trichothiodystrophy, chronic
hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological disorder, such as epilepsy,
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease,
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia,
diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia,
Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial
frontotemporal dementia; and a reproductive disorder, such as infertility, including tubal disease,
ovulatory defects, and endometriosis, a disorder of prolactin production, a disruption of the estrous
cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation
syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, an ectopic
pregnancy, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a
disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the
prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the
male breast, and gynecomastia; an endocrine disorder such as a disorder of the hypothalamus and/or
pituitary resulting from lesions such as a primary brain tumor, adenoma, infarction associated with
pregnancy, hypophysectomy, aneurysm, vascular malformation, thrombosis, infection,
immunological disorder, and complication due to head trauma; a disorder associated with
hypopituitarism including hypogonadism, Sheehan syndrome, diabetes insipidus, Kallman's disease,
Hand-Schuller-Christian disease, Letterer-Siwe disease, sarcoidosis, empty sella syndrome, and
dwarfism; a disorder associated with hyperpituitarism including acromegaly, giantism, and syndrome
of inappropriate antidiuretic hormone (ADH) secretion (SIADH) often caused by benign adenoma; a
disorder associated with hypothyroidism including goiter, myxedema, acute thyroiditis associated
with bacterial infection, subacute thyroiditis associated with viral infection, autoimmune thyroiditis
(Hashimoto's disease), and cretinism; a disorder associated with hyperthyroidism including
thyrotoxicosis and its various forms, Graves' disease, pretibial myxedema, toxic multinodular goiter,
thyroid carcinoma, and Plummer's disease; a disorder associated with hyperparathyroidism including
Conn disease (chronic hypercalcemia); a metabolic disorder such as Addison's disease,
cerebrotendinous xanthomatosis, congenital adrenal hyperplasia, coumarin resistance, cystic fibrosis,
diabetes, fatty hepatocirrhosis, fructose-1,6-diphosphatase deficiency, galactosemia, goiter,
glucagonoma, glycogen storage diseases, hereditary fructose intolerance, hyperadrenalism,
hypoadrenalism, hyperparathyroidism, hypoparathyroidism, hypercholesterolemia, hyperthyroidism,
hypoglycemia, hypothyroidism, hyperlipidemia, hyperlipemia, lipid myopathies, lipodystrophies,
lysosomal storage diseases, mannosidosis, neuraminidase deficiency, obesity, pentosuria,
phenylketonuria, pseudovitamin D-deficiency rickets; a disorder of carbohydrate metabolism such as
congenital type II dyserythropoietic anemia, diabetes, insulin-dependent diabetes mellitus,
non-insulin-dependent diabetes mellitus, fructose-1,6-diphosphatase deficiency, galactosemia,
glucagonoma, hereditary fructose intolerance, hypoglycemia, mannosidosis, neuraminidase
deficiency, obesity, galactose epimerase deficiency, glycogen storage diseases, lysosomal storage
diseases, fructosuria, pentosuria, and inherited abnormalities of pyruvate metabolism; a disorder of
lipid metabolism such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine deficiency,
carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia,
lipid storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease,
metachromatic leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis, and ceroid
lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus,
lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid
adrenal hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism,
renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous
xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease,
hyperlipidemia, hyperlipemia, lipid myopathies, and obesity; and a disorder of copper metabolism
such as Menke's disease, Wilson's disease, and Ehlers-Danlos syndrome type IX; a pancreatic
disorder such as Type I or Type II diabetes mellitus and associated complications; a disorder
associated with the adrenals such as hyperplasia, carcinoma, or adenoma of the adrenal cortex,
hypertension associated with alkalosis, amyloidosis, hypokalemia, Cushing's disease, Liddle's
syndrome, and Arnold-Healy-Gordon syndrome, pheochromocytoma tumors, and Addison's disease;
a disorder associated with gonadal steroid hormones such as: in women, abnormal prolactin
production, infertility, endometriosis, perturbation of the menstrual cycle, polycystic ovarian disease,
hyperprolactinemia, isolated gonadotropin deficiency, amenorrhea, galactorrhea, hermaphroditism,
hirsutism and virilization, breast cancer, and, in post-menopausal women, osteoporosis; and, in men,
Leydig cell deficiency, male climacteric phase, and germinal cell aplasia, a hypergonadal disorder
associated with Leydig cell tumors, androgen resistance associated with absence of androgen
receptors, syndrome of 5 α-reductase, and gynecomastia; a cancer such as adenocarcinoma, leukemia,
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, and uterus; and an infection caused by a viral agent classified as
adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, herpesvirus,
flavivirus, orthomyxovirus, parvovirus, papovavirus, paramyxovirus, picornavirus, poxvirus,
reovirus, retrovirus, rhabdovirus, or togavirus; an infection caused by a bacterial agent classified as
pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, clostridium,
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella,
gram-negative enterobacterium including shigella, salmonella, or campylobacter, pseudomonas, vibrio,
brucella, francisella, yersinia, bartonella, norcardium, actinomyces, mycobacterium, spirochaetale,
rickettsia, chlamydia, or mycoplasma; an infection caused by a fungal agent classified as aspergillus,
blastomyces, dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, or other
mycosis-causing fungal agent; and an infection caused by a parasite classified as plasmodium or
malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal
protozoa such as giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such
as ascaris, lymphatic filarial nematode, trematode such as schistosoma, and cestode such as
tapeworm.

In another embodiment, a vector capable of expressing PMMM or a fragment or derivative 
thereof may be administered to a subject to treat or prevent a disorder associated with decreased 
expression or activity of PMMM including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified PMMM in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent
a disorder associated with decreased expression or activity of PMMM including, but not limited to,
those provided above.

In still another embodiment, an agonist which modulates the activity of PMMM may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of PMMM including, but not limited to, those listed above.

In a further embodiment, an antagonist of PMMM may be administered to a subject to treat or 
prevent a disorder associated with increased expression or activity of PMMM. Examples of such
disorders include, but are not limited to, those gastrointestinal, cardiovascular,
autoimmune/inflammatory, cell proliferative, developmental, epithelial, neurological, reproductive,
endocrine, metabolic, pancreatic disorders, disorders associated with the adrenals, disorders
associated with gonadal steroid hormones, cancers, and infections described above. In one aspect, an
antibody which specifically binds PMMM may be used directly as an antagonist or indirectly as a
targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express
PMMM.

In an additional embodiment, a vector expressing the complement of the polynucleotide 
encoding PMMM may be administered to a subject to treat or prevent a disorder associated with 
increased expression or activity of PMMM including, but not limited to, those described above. 

In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence,
or vector embodiments may be administered in combination with other appropriate therapeutic
agents. Selection of the appropriate agents for use in combination therapy may be made by one of
ordinary skill in the art, according to conventional pharmaceutical principles. The combination of
therapeutic agents may act synergistically to effect the treatment or prevention of the various
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with
lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of PMMM may be produced using methods which are generally known in the 
art. In particular, purified PMMM may be used to produce antibodies or to screen libraries of 
pharmaceutical agents to identify those which specifically bind PMMM. Antibodies to PMMM may
also be generated using methods that are well known in the art. Such antibodies may include, but are
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and
fragments produced by a Fab expression library. In an embodiment, neutralizing antibodies (i.e.,
those which inhibit dimer formation) can be used therapeutically. Single chain antibodies (e.g., from
camels or llamas) may be potent enzyme inhibitors and may have application in the design of peptide
mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J.
Biotechnol. 74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels, 
dromedaries, llamas, humans, and others may be immunized by injection with PMMM or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol.
Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to 
PMMM have an amino acid sequence consisting of at least about 5 amino acids, and generally will 
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or 
fragments are substantially identical to a portion of the amino acid sequence of the natural protein.
Short stretches of PMMM amino acids may be fused with those of another protein, such as KLH, and
antibodies to the chimeric molecule may be produced.
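
The length guidance above can be illustrated with a minimal Python sketch that lists every contiguous
peptide of a chosen length from a protein sequence, so that each candidate is identical to a portion of
the natural protein. The function name, the default length of 10 residues, and the example sequence
are illustrative assumptions; the sequence shown is a placeholder and is not a PMMM sequence, and
the sketch makes no prediction about immunogenicity.

```python
def candidate_peptides(protein, length=10):
    """Return every contiguous window of `length` residues in `protein`.

    Hypothetical helper: `length` defaults to 10 because the text states that
    peptides will generally consist of at least about 10 amino acids, and
    lengths below 5 are rejected because about 5 is the stated minimum.
    """
    if length < 5:
        raise ValueError("peptides shorter than about 5 residues are not considered")
    # A window starting at i covers residues i .. i + length - 1.
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Placeholder sequence for illustration only (not a PMMM sequence).
example_protein = "MKTAYIAKQRQISFVK"
peptides = candidate_peptides(example_protein, length=10)
```

A sequence shorter than the requested length simply yields an empty list, which leaves the choice of
a shorter peptide or a carrier fusion (e.g., with KLH) to the practitioner.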

Monoclonal antibodies to PMMM may be prepared using any technique which provides for
the production of antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol.
Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; Cole, S.P. et al.
(1984) Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the 
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985)
Nature 314:452-454). Alternatively, techniques described for the production of single chain
antibodies may be adapted, using methods known in the art, to produce PMMM-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D.R.
(1991) Proc. Natl. Acad. Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte 
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as 
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for PMMM may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al. (1989)
Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired 
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between PMMM and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies
reactive to two non-interfering PMMM epitopes is generally used, but a competitive binding assay
may also be employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay
techniques may be used to assess the affinity of antibodies for PMMM. Affinity is expressed as an
association constant, K_a, which is defined as the molar concentration of PMMM-antibody complex
divided by the molar concentrations of free antigen and free antibody under equilibrium conditions.
The K_a determined for a preparation of polyclonal antibodies, which are heterogeneous in their
affinities for multiple PMMM epitopes, represents the average affinity, or avidity, of the antibodies
for PMMM. The K_a determined for a preparation of monoclonal antibodies, which are monospecific
for a particular PMMM epitope, represents a true measure of affinity. High-affinity antibody
preparations with K_a ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in
which the PMMM-antibody complex must withstand rigorous manipulations. Low-affinity antibody
preparations with K_a ranging from about 10^6 to 10^7 L/mole are preferred for use in
immunopurification and similar procedures which ultimately require dissociation of PMMM,
preferably in active form, from the antibody (Catty, D. (1988) Antibodies, Volume I: A Practical
Approach, IRL Press, Washington DC; Liddell, J.E. and A. Cryer (1991) A Practical Guide to
Monoclonal Antibodies, John Wiley & Sons, New York NY).
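
The definition of the association constant and its use in Scatchard analysis can be sketched in
Python as follows. This is a minimal illustration under stated assumptions, not a prescribed
procedure: it assumes a single class of binding sites, so that bound/free = K_a x (B_max - bound),
and it fits that line by ordinary least squares. All function names are hypothetical, and the
hard cutoffs in `suggested_use` merely restate the approximate ranges given above.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """K_a = [complex] / ([free antigen] * [free antibody]), in L/mole."""
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def scatchard_fit(bound, free):
    """Fit bound/free against bound by least squares; return (K_a, B_max).

    Assumes one class of sites, where bound/free = K_a * (B_max - bound),
    so the slope of the line is -K_a and its x-intercept is B_max.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired (bound, free) measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_a = -slope
    b_max = intercept / k_a
    return k_a, b_max


def suggested_use(k_a):
    """Map K_a (L/mole) onto the approximate ranges stated in the text."""
    if 1e9 <= k_a <= 1e12:
        return "immunoassay"
    if 1e6 <= k_a <= 1e7:
        return "immunopurification"
    return "unclassified"
```

For a polyclonal preparation the fitted value would correspond to the average affinity described
above, and a curved Scatchard plot would indicate that the single-site assumption does not hold.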

The titer and avidity of polyclonal antibody preparations may be further evaluated to
determine the quality and suitability of such preparations for certain downstream applications. For
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml,
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation
of PMMM-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and
guidelines for antibody quality and usage in various applications, are generally available (Catty,
supra; Coligan et al., supra).

In another embodiment of the invention, polynucleotides encoding PMMM, or any fragment
or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
PMMM. Such technology is well known in the art, and antisense oligonucleotides or larger
fragments can be designed from various locations along the coding or control regions of sequences
encoding PMMM (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa NJ).
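
As a minimal sketch of what designing a complementary sequence involves, the following Python
function returns the reverse complement of a chosen window of a sense-strand DNA sequence,
written 5' to 3'. The function name, the default oligonucleotide length of 20 bases, and the
example sequence are illustrative assumptions; the sequence is a placeholder rather than a
PMMM-encoding sequence, and the sketch does not address target accessibility, specificity, or
chemical modification.

```python
# Watson-Crick pairing for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(sense_dna, start, length=20):
    """Return the antisense DNA oligonucleotide (5'->3') for a target window.

    `sense_dna` is the coding (sense) strand written 5'->3'; `start` is the
    0-based position of the window and `length` its size in bases.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    target = sense_dna[start:start + length].upper()
    if len(target) != length:
        raise ValueError("window extends past the end of the sequence")
    if set(target) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_COMPLEMENT)[::-1]


# Placeholder sense-strand fragment for illustration only.
oligo = antisense_oligo("ATGGCCAAGT", start=0, length=6)
```

Sliding `start` along a coding or control region yields oligonucleotides designed from various
locations, in the sense described above.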

In therapeutic use, any gene delivery system suitable for introduction of the antisense 
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E. et
al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K.J. et al. (1995) 9:1288-1296).
Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as
retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271; Ausubel et al.,
supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene delivery
mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in
the art (Rossi, J.J. (1995) Br. Med. Bull. 51:217-225; Boado, R.J. et al. (1998) J. Pharm. Sci.
87:1308-1315; Morris, M.C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

In another embodiment of the invention, polynucleotides encoding PMMM may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites
(e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D.
(1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399),
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in PMMM expression or regulation causes disease, the expression of
PMMM from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in 
PMMM are treated by constructing mammalian expression vectors encoding PMMM and introducing 
these vectors by mechanical means into PMMM-deficient cells. Mechanical transfer technologies for 
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) 
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene 
transfer, and (v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev.
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Recipon (1998) Curr.
Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of PMMM include, but are not 
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors 
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), 
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). PMMM 
may be expressed using (i) a constitutively active promoter, (e.g., from cytomegalovirus (CMV), 
Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl.
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter
(Rossi, F.M.V. and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of
the endogenous gene encoding PMMM from a normal individual. 

Commercially available liposome transformation kits (e.g., the PERFECT LIPID 
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver 
polynucleotides to target cells in culture and require minimal effort to optimize experimental 
parameters. In the alternative, transformation is performed using the calcium phosphate method 
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols. 

In another embodiment of the invention, diseases or disorders caused by genetic defects with 
respect to PMMM expression are treated by constructing a retrovirus vector consisting of (i) the 
polynucleotide encoding PMMM under the control of an independent promoter or the retrovirus long 
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive 
element (RRE) along with additional retrovirus ris-acting RNA sequences and coding sequences 
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required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are 
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. 
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in 
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and 
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. 
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining 
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant")
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells),
and the return of transduced cells to a patient are procedures well known to persons skilled in
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029;
Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716;
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver 
polynucleotides encoding PMMM to cells which have one or more genetic abnormalities with respect 
to the expression of PMMM. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999; Annu.
Rev. Nutr. 19:511-544) and Verma, I.M. and N. Somia (1997; Nature 389:239-242).

In another embodiment, a herpes-based, gene therapy delivery system is used to deliver 
polynucleotides encoding PMMM to target cells which have one or more genetic abnormalities with 
respect to the expression of PMMM. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing PMMM to cells of the central nervous system, for which HSV has
a tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92 which
consists of a genome containing at least one exogenous gene to be transferred to a cell under the
control of the appropriate promoter for purposes including human gene therapy. Also taught by this
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
For HSV vectors, see also Goins, W.F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al. (1994; 
Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation of 
recombinant virus following the transfection of multiple plasmids containing different segments of 
the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells
with herpesvirus are techniques well known to those of ordinary skill in the art. 

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to 
deliver polynucleotides encoding PMMM to target cells. The biology of the prototypic alphavirus, 
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based 
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During 
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity 
(e.g., protease and polymerase). Similarly, inserting the coding sequence for PMMM into the 
alphavirus genome in place of the capsid-coding region results in the production of a large number of 
PMMM-coding RNAs and the synthesis of high levels of PMMM in vector transduced cells. While 
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a 
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) 
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy 
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will
allow the introduction of PMMM into a variety of cell types. The specific transduction of a subset of 
cells in a population may require the sorting of cells prior to transduction. The methods of 
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA 
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the
art. 

Oligonucleotides derived from the transcription initiation site, e.g., between about positions 
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly, 
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful 
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of 
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using 
triplex DNA have been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr, 
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes. 

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of 
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze 
endonucleolytic cleavage of RNA molecules encoding PMMM. 

Specific ribozyme cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for 
secondary structural features which may render the oligonucleotide inoperable. The suitability of 
candidate targets may also be evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
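
The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `find_cleavage_windows` and the `window` parameter are hypothetical, centering the window on the triplet is an assumption, and the secondary-structure and accessibility evaluations described in the text are not performed here.

```python
import re

# Triplets named in the text as candidate ribozyme cleavage sites.
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_cleavage_windows(rna, window=17):
    """Return (site_index, triplet, window_sequence) for each cleavage triplet.

    `window` is a hypothetical parameter: the length (15 to 20 nt) of the
    region around the site that would later be checked for secondary structure.
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 nucleotides")
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    # A lookahead is used so that overlapping triplets are all reported.
    pattern = "(?=(" + "|".join(CLEAVAGE_TRIPLETS) + "))"
    hits = []
    for match in re.finditer(pattern, rna):
        site = match.start(1)
        # Center the window on the triplet, clipped to the sequence ends.
        start = max(0, min(site + 1 - window // 2, len(rna) - window))
        region = rna[start:start + window]
        if len(region) == window:  # skip targets shorter than the window
            hits.append((site, match.group(1), region))
    return hits
```

Each returned window is only a candidate; whether it is usable still depends on the structural and ribonuclease protection checks described above.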

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method
known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically
synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively,
RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding
PMMM. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA
polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues. 

RNA molecules may be modified to increase intracellular stability and half-life. Possible 
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' 
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs
and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine,
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine,
cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous 
endonucleases. 

In other embodiments of the invention, the expression of one or more selected
polynucleotides of the present invention can be altered, inhibited, decreased, or silenced using RNA
interference (RNAi) or post-transcriptional gene silencing (PTGS) methods known in the art. RNAi
is a post-transcriptional mode of gene silencing in which double-stranded RNA (dsRNA) introduced
into a targeted cell specifically suppresses the expression of the homologous gene (i.e., the gene
bearing the sequence complementary to the dsRNA). This effectively knocks out or substantially
reduces the expression of the targeted gene. PTGS can also be accomplished by use of DNA or DNA
fragments as well. RNAi methods are described by Fire, A. et al. (1998; Nature 391:806-811) and
Gura, T. (2000; Nature 404:804-808). PTGS can also be initiated by introduction of a
complementary segment of DNA into the selected tissue using gene delivery and/or viral vector
delivery methods described herein or known in the art. 

RNAi can be induced in mammalian cells by the use of small interfering RNA also known as 
siRNA. SiRNA are shorter segments of dsRNA (typically about 21 to 23 nucleotides in length) that 
result in vivo from cleavage of introduced dsRNA by the action of an endogenous ribonuclease. 
SiRNA appear to be the mediators of the RNAi effect in mammals. The most effective siRNAs 
appear to be 21 nucleotide dsRNAs with 2 nucleotide 3' overhangs. The use of siRNA for inducing 
RNAi in mammalian cells is described by Elbashir, S.M. et al. (2001; Nature 411:494-498).

SiRNA can either be generated indirectly by introduction of dsRNA into the targeted cell, or 
directly by mammalian transfection methods and agents described herein or known in the art (such as 
liposome-mediated transfection, viral vector methods, or other polynucleotide delivery/introductory
methods). Suitable siRNAs can be selected by examining a transcript of the target polynucleotide
(e.g., mRNA) for nucleotide sequences downstream from the AUG start codon and recording the
occurrence of each nucleotide and the 3' adjacent 19 to 23 nucleotides as potential siRNA target sites,
with sequences having a 21 nucleotide length being preferred. Regions to be avoided for target
siRNA sites include the 5' and 3' untranslated regions (UTRs) and regions near the start codon (within
75 bases), as these may be richer in regulatory protein binding sites. UTR-binding proteins and/or
translation initiation complexes may interfere with binding of the siRNP endonuclease complex. The
selected target sites for siRNA can then be compared to the appropriate genome database (e.g.,
human, etc.) using BLAST or other sequence comparison algorithms known in the art. Target
sequences with significant homology to other coding sequences can be eliminated from consideration.
The selected siRNAs can be produced by chemical synthesis methods known in the art or by in vitro
transcription using commercially available methods and kits such as the SILENCER siRNA 
construction kit (Ambion, Austin TX). 
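
The site-selection rules above (coding region only, at least 75 bases downstream of the start codon, 21-nucleotide targets preferred) can be illustrated with a short Python sketch. The function name `candidate_sirna_sites` and its parameters are hypothetical, and measuring the 75-base exclusion from the first base of the AUG is an assumption. The subsequent database comparison (e.g., BLAST) is not shown.

```python
def candidate_sirna_sites(mrna, cds_start, cds_end, length=21, skip=75):
    """List (position, target_sequence) candidates inside the coding region.

    cds_start: 0-based index of the A in the AUG start codon.
    cds_end:   0-based index one past the stop codon (start of the 3' UTR).
    length:    target length; the text prefers 21 nucleotides.
    skip:      bases after the start codon to avoid (75 in the text).
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    if mrna[cds_start:cds_start + 3] != "AUG":
        raise ValueError("cds_start does not point at an AUG start codon")
    first = cds_start + skip   # stay clear of the start-codon region
    last = cds_end - length    # the whole target must end before the 3' UTR
    return [(i, mrna[i:i + length]) for i in range(first, last + 1)]
```

Every position in the permitted region yields one candidate, so the list is deliberately exhaustive; filtering by homology to other coding sequences would follow as a separate step.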

In alternative embodiments, long-term gene silencing and/or RNAi effects can be induced in 
selected tissue using expression vectors that continuously express siRNA. This can be accomplished 
using expression vectors that are engineered to express hairpin RNAs (shRNAs) using methods 
known in the art (see, e.g., Brummelkamp, T.R. et al. (2002) Science 296:550-553; and Paddison, P.J. 
et al. (2002) Genes Dev. 16:948-958). In these and related embodiments, shRNAs can be delivered to 
target cells using expression vectors known in the art. An example of a suitable expression vector for 
delivery of siRNA is the PSILENCER1.0-U6 (circular) plasmid (Ambion). Once delivered to the
target tissue, shRNAs are processed in vivo into siRNA-like molecules capable of carrying out
gene-specific silencing.

In various embodiments, the expression levels of genes targeted by RNAi or PTGS methods 
can be determined by assays for mRNA and/or protein analysis. Expression levels of the mRNA of a 
targeted gene can be determined by northern analysis methods using, for example, the
NORTHERNMAX-GLY kit (Ambion); by microarray methods; by PCR methods; by real time PCR
methods; and by other RNA/polynucleotide assays known in the art or described herein. Expression
levels of the protein encoded by the targeted gene can be determined by Western analysis using
standard techniques known in the art.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding PMMM. 
Compounds which may be effective in altering expression of a specific polynucleotide may include, 
but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming 
oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and
non-macromolecular chemical entities which are capable of interacting with specific polynucleotide
sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or
promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased
PMMM expression or activity, a compound which specifically inhibits expression of the
polynucleotide encoding PMMM may be therapeutically useful, and in the treatment of disorders
associated with decreased PMMM expression or activity, a compound which specifically promotes
expression of the polynucleotide encoding PMMM may be therapeutically useful. 

In various embodiments, one or more test compounds may be screened for effectiveness in 
altering expression of a specific polynucleotide. A test compound may be obtained by any method 
commonly known in the art, including chemical modification of a compound known to be effective in 
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound 
based on chemical and/or structural properties of the target polynucleotide; and selection from a 
library of chemical compounds created combinatorially or randomly. A sample comprising a 
polynucleotide encoding PMMM is exposed to at least one test compound thus obtained. The sample 
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding PMMM are assayed 
by any method commonly known in the art. Typically, the expression of a specific polynucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding PMMM. The amount of hybridization may be quantified, thus
forming the basis for a comparison of the expression of the polynucleotide both with and without
exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide
exposed to a test compound indicates that the test compound is effective in altering the expression of 
the polynucleotide. A screen for a compound effective in altering expression of a specific 
polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression 
system (Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids
Res. 28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys.
Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide
nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide
sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S.
Patent No. 6,022,691). 

Many methods for introducing vectors into cells or tissues are available and equally suitable 
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat. Biotechnol. 15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits and 
monkeys.

An additional embodiment of the invention relates to the administration of a composition 
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may
consist of PMMM, antibodies to PMMM, and mimetics, agonists, antagonists, or inhibitors of 
PMMM. 

In various embodiments, the compositions described herein, such as pharmaceutical 
compositions, may be administered by any number of routes including, but not limited to, oral,
intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary,
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. 

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the 
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, 
J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery allows administration without needle 
injection, and obviates the need for potentially toxic penetration enhancers. 

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination 
of an effective dose is well within the capability of those skilled in the art. 

Specialized forms of compositions may be prepared for direct intracellular delivery of 
macromolecules comprising PMMM or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of
the macromolecule. Alternatively, PMMM or a fragment thereof may be joined to a short cationic N- 
terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to 
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et 
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, 
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration 
range and route of administration. Such information can then be used to determine useful doses and 
routes for administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example
PMMM or fragments thereof, antibodies of PMMM, and agonists, antagonists or inhibitors of
PMMM, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be
determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such
as by calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the
dose lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration. 
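
The therapeutic index defined above is a simple ratio, and a short Python sketch shows how it could be used to rank compositions. The function name `therapeutic_index` and the dose values are hypothetical and serve only to illustrate the arithmetic; they are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return the LD50/ED50 ratio; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical (LD50, ED50) pairs in mg/kg, chosen only as an illustration.
candidates = {"compound_a": (500.0, 25.0), "compound_b": (300.0, 60.0)}

# Larger indices are preferred, so sort in descending order of the ratio.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
```

Under these assumed values, `compound_a` has an index of 20 and `compound_b` an index of 5, so `compound_a` ranks first.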

The exact dosage will be determined by the practitioner, in light of factors related to the 
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides,
including genomic sequences, encoding PMMM or closely related molecules may be used to identify 
nucleic acid sequences which encode PMMM. The specificity of the probe, whether it is made from 
a highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved
motif, and the stringency of the hybridization or amplification will determine whether the probe
identifies only naturally occurring sequences encoding PMMM, allelic variants, or related sequences. 

Probes may also be used for the detection of related sequences, and may have at least 50% 
sequence identity to any of the PMMM encoding sequences. The hybridization probes of the subject 
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:32-62 or from
genomic sequences including promoters, enhancers, and introns of the PMMM gene.
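
As an illustration of the 50% sequence identity criterion above, the following Python sketch computes percent identity for two sequences that have already been aligned to equal length. The function names are hypothetical, the alignment step itself is assumed to have been done elsewhere, and counting gap columns as mismatches is a simplifying assumption rather than the definition used in this disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same nucleotide.

    Both inputs must be pre-aligned to equal length; '-' marks a gap and
    is counted as a mismatch (a simplifying assumption).
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_probe_threshold(aligned_a, aligned_b, threshold=50.0):
    """True if identity is at least `threshold` percent (50% in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `percent_identity("ACGTACGT", "ACGTTCGA")` gives 75.0, which satisfies the 50% criterion.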

Means for producing specific hybridization probes for polynucleotides encoding PMMM 
include the cloning of polynucleotides encoding PMMM or PMMM derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may 
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as 32P or 35S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like. 

Polynucleotides encoding PMMM may be used for the diagnosis of disorders associated with 
expression of PMMM. Examples of such disorders include, but are not limited to, a gastrointestinal
disorder, such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal
carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis,
gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal 
obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, 
pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis,
passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis,
Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic
obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal 
hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic 
encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease,
alpha1-antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein
obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis,
veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of
pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a 
cardiovascular disorder, such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis,
Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and

urticaria, transient acantholytic dermatosis, xerosis, eczema, atopic dermatitis, contact dermatitis,
hand eczema, nummular eczema, lichen simplex chronicus, asteatotic eczema, stasis dermatitis and 
stasis ulceration, seborrheic dermatitis, psoriasis, lichen planus, pityriasis rosea, impetigo, ecthyma, 
dermatophytosis, tinea versicolor, warts, acne vulgaris, acne rosacea, pemphigus vulgaris, pemphigus 
foliaceus, paraneoplastic pemphigus, bullous pemphigoid, herpes gestationis, dermatitis
herpetiformis, linear IgA disease, epidermolysis bullosa acquisita, dermatomyositis, lupus 
erythematosus, scleroderma and morphea, erythroderma, alopecia, figurate skin lesions, 
telangiectasias, hypopigmentation, hyperpigmentation, vesicles/bullae, exanthems, cutaneous drug 
reactions, papulonodular skin lesions, chronic non-healing wounds, photosensitivity diseases,
epidermolysis bullosa simplex, epidermolytic hyperkeratosis, epidermolytic and nonepidermolytic
palmoplantar keratoderma, ichthyosis bullosa of Siemens, ichthyosis exfoliativa, keratosis palmaris et
plantaris, keratosis palmoplantaris, palmoplantar keratoderma, keratosis punctata, Meesmann's
corneal dystrophy, pachyonychia congenita, white sponge nevus, steatocystoma multiplex, epidermal
nevi/epidermolytic hyperkeratosis type, monilethrix, trichothiodystrophy, chronic
hepatitis/cryptogenic cirrhosis, and colorectal hyperplasia; a neurological disorder, such as epilepsy,
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, 
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic 
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis 
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases 
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal 
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, 
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve 
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral 
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and 
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia,
diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, 
Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial 
frontotemporal dementia; and a reproductive disorder, such as infertility, including tubal disease, 
ovulatory defects, and endometriosis, a disorder of prolactin production, a disruption of the estrous
cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation
syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, an ectopic
pregnancy, and teratogenesis; cancer of the breast, fibrocystic breast disease, and galactorrhea; a
disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the
prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the
male breast, and gynecomastia; an endocrine disorder such as a disorder of the hypothalamus and/or
pituitary resulting from lesions such as a primary brain tumor, adenoma, infarction associated with 
pregnancy, hypophysectomy, aneurysm, vascular malformation, thrombosis, infection, 
immunological disorder, and complication due to head trauma; a disorder associated with 
hypopituitarism including hypogonadism, Sheehan syndrome, diabetes insipidus, Kallman's disease,
Hand-Schuller-Christian disease, Letterer-Siwe disease, sarcoidosis, empty sella syndrome, and
dwarfism; a disorder associated with hyperpituitarism including acromegaly, giantism, and syndrome
of inappropriate antidiuretic hormone (ADH) secretion (SIADH) often caused by benign adenoma; a
disorder associated with hypothyroidism including goiter, myxedema, acute thyroiditis associated
with bacterial infection, subacute thyroiditis associated with viral infection, autoimmune thyroiditis
(Hashimoto's disease), and cretinism; a disorder associated with hyperthyroidism including
thyrotoxicosis and its various forms, Graves' disease, pretibial myxedema, toxic multinodular goiter,
thyroid carcinoma, and Plummer's disease; a disorder associated with hyperparathyroidism including
Conn disease (chronic hypercalcemia); a metabolic disorder such as Addison's disease,
cerebrotendinous xanthomatosis, congenital adrenal hyperplasia, coumarin resistance, cystic fibrosis, 
diabetes, fatty hepatocirrhosis, fructose-1,6-diphosphatase deficiency, galactosemia, goiter,
glucagonoma, glycogen storage diseases, hereditary fructose intolerance, hyperadrenalism,
hypoadrenalism, hyperparathyroidism, hypoparathyroidism, hypercholesterolemia, hyperthyroidism,
hypoglycemia, hypothyroidism, hyperlipidemia, hyperlipemia, lipid myopathies, lipodystrophies,
lysosomal storage diseases, mannosidosis, neuraminidase deficiency, obesity, pentosuria,
phenylketonuria, pseudovitamin D-deficiency rickets; a disorder of carbohydrate metabolism such as
congenital type II dyserythropoietic anemia, diabetes, insulin-dependent diabetes mellitus,
non-insulin-dependent diabetes mellitus, fructose-1,6-diphosphatase deficiency, galactosemia,
glucagonoma, hereditary fructose intolerance, hypoglycemia, mannosidosis, neuraminidase 
deficiency, obesity, galactose epimerase deficiency, glycogen storage diseases, lysosomal storage 
diseases, fructosuria, pentosuria, and inherited abnormalities of pyruvate metabolism; a disorder of
lipid metabolism such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine deficiency,
carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia,
lipid storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease,
metachromatic leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis, and ceroid
lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus,
lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid
adrenal hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism,
renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous
xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease,
hyperlipidemia, hyperlipemia, lipid myopathies, and obesity; and a disorder of copper metabolism
such as Menke's disease, Wilson's disease, and Ehlers-Danlos syndrome type IX; a pancreatic 
disorder such as Type I or Type II diabetes mellitus and associated complications; a disorder 
associated with the adrenals such as hyperplasia, carcinoma, or adenoma of the adrenal cortex,
hypertension associated with alkalosis, amyloidosis, hypokalemia, Cushing's disease, Liddle's
syndrome, and Arnold-Healy-Gordon syndrome, pheochromocytoma tumors, and Addison's disease;
a disorder associated with gonadal steroid hormones such as: in women, abnormal prolactin 
production, infertility, endometriosis, perturbation of the menstrual cycle, polycystic ovarian disease, 
hyperprolactinemia, isolated gonadotropin deficiency, amenorrhea, galactorrhea, hermaphroditism,
hirsutism and virilization, breast cancer, and, in post-menopausal women, osteoporosis; and, in men,
Leydig cell deficiency, male climacteric phase, and germinal cell aplasia, a hypergonadal disorder
associated with Leydig cell tumors, androgen resistance associated with absence of androgen
receptors, syndrome of 5α-reductase, and gynecomastia; a cancer such as adenocarcinoma, leukemia,
lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal
gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, 
spleen, testis, thymus, thyroid, and uterus; and an infection caused by a viral agent classified as 
adenovirus, arenavirus, bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, herpesvirus, 
flavivirus, orthomyxovirus, parvovirus, papovavirus, paramyxovirus, picornavirus, poxvirus,
reovirus, retrovirus, rhabdovirus, or togavirus; an infection caused by a bacterial agent classified as
pneumococcus, staphylococcus, streptococcus, bacillus, corynebacterium, clostridium,
meningococcus, gonococcus, listeria, moraxella, kingella, haemophilus, legionella, bordetella,
gram-negative enterobacterium including shigella, salmonella, or campylobacter, pseudomonas, vibrio,
brucella, francisella, yersinia, bartonella, norcardium, actinomyces, mycobacterium, spirochaetale,
rickettsia, chlamydia, or mycoplasma; an infection caused by a fungal agent classified as aspergillus,
blastomyces, dermatophytes, cryptococcus, coccidioides, malasezzia, histoplasma, or other
mycosis-causing fungal agent; and an infection caused by a parasite classified as plasmodium or
malaria-causing, parasitic entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal
protozoa such as giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such
as ascaris, lymphatic filarial nematode, trematode such as schistosoma, and cestode such as
tapeworm. Polynucleotides encoding PMMM may be used in Southern or northern analysis, dot blot,
or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat
ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered
PMMM expression. Such qualitative or quantitative methods are well known in the art. 

In a particular embodiment, polynucleotides encoding PMMM may be used in assays that
detect the presence of associated disorders, particularly those mentioned above. Polynucleotides
complementary to sequences encoding PMMM may be labeled by standard methods and added to a
fluid or tissue sample from a patient under conditions suitable for the formation of hybridization
complexes. After a suitable incubation period, the sample is washed and the signal is quantified and
compared with a standard value. If the amount of signal in the patient sample is significantly altered
in comparison to a control sample, then the presence of altered levels of polynucleotides encoding
PMMM in the sample indicates the presence of the associated disorder. Such assays may also be
used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in 
clinical trials, or to monitor the treatment of an individual patient. 
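
The comparison of a patient signal with control values can be sketched as follows. The text does not define "significantly altered", so the z-score rule and the cutoff of 2.0 below are illustrative assumptions, as are the function name and the sample values.

```python
from statistics import mean, stdev

def is_significantly_altered(patient_signal, control_signals, z_cutoff=2.0):
    """Flag a patient hybridization signal that deviates from control signals.

    Assumption: "significantly altered" is modeled as a z-score whose
    absolute value exceeds z_cutoff; the source states no specific criterion.
    """
    mu = mean(control_signals)
    sigma = stdev(control_signals)  # sample standard deviation of the controls
    if sigma == 0:
        # All controls identical: any difference counts as altered.
        return patient_signal != mu
    return abs(patient_signal - mu) / sigma > z_cutoff

# Hypothetical signal intensities in arbitrary units.
controls = [100.0, 104.0, 98.0, 101.0, 97.0]
print(is_significantly_altered(180.0, controls))  # strongly elevated signal
print(is_significantly_altered(101.0, controls))  # within normal variation
```

The rule is symmetric, so under- and overexpression are both flagged.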

In order to provide a basis for the diagnosis of a disorder associated with expression of
PMMM, a normal or standard profile for expression is established. This may be accomplished by
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a
sequence, or a fragment thereof, encoding PMMM under conditions suitable for hybridization or
amplification. Standard hybridization may be quantified by comparing the values obtained from
normal subjects with values from an experiment in which a known amount of a substantially purified
polynucleotide is used. Standard values obtained in this manner may be compared with values 
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard 
values is used to establish the presence of a disorder. 

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from 
successive assays may be used to show the efficacy of treatment over a period ranging from several 
days to months. 

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the
development of the disease, or may provide a means for detecting the disease prior to the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals
to employ preventative measures or aggressive treatment earlier, thereby preventing the development
or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding
PMMM may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding PMMM, or a fragment of a polynucleotide complementary to the polynucleotide encoding
PMMM, and will be employed under optimized conditions for identification of a specific gene or
condition. Oligomers may also be employed under less stringent conditions for detection or
quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding
PMMM may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions,
insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans.
Methods of SNP detection include, but are not limited to, single-stranded conformation
polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers
derived from polynucleotides encoding PMMM are used to amplify DNA using the polymerase chain
reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy
samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary
structures of PCR products in single-stranded form, and these differences are detectable using gel
electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently
labeled, which allows detection of the amplimers in high-throughput equipment such as DNA 
sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing
errors using statistical models and automated analyses of DNA sequence chromatograms. In the
alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the
high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
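
The in silico comparison of overlapping fragments against a consensus can be illustrated with a minimal sketch. It assumes the fragments are already aligned to equal length and substitutes simple count and frequency thresholds for the statistical models mentioned above; the function name, thresholds, and reads are hypothetical.

```python
from collections import Counter

def call_candidate_snps(aligned_reads, min_minor_count=2, min_minor_frac=0.2):
    """Build a majority consensus and list candidate substitution SNPs.

    Assumptions: reads are pre-aligned and equal in length; a variant seen
    fewer than min_minor_count times, or below min_minor_frac of the reads
    at that column, is treated as a likely sequencing error and ignored.
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    candidates = []
    for pos in range(length):
        # Count only unambiguous bases in this alignment column.
        counts = Counter(read[pos] for read in aligned_reads if read[pos] in "ACGT")
        if not counts:
            consensus.append("N")
            continue
        ranked = counts.most_common()
        major_base = ranked[0][0]
        consensus.append(major_base)
        if len(ranked) > 1:
            minor_base, minor_count = ranked[1]
            total = sum(counts.values())
            if minor_count >= min_minor_count and minor_count / total >= min_minor_frac:
                candidates.append((pos, major_base, minor_base))
    return "".join(consensus), candidates

# Hypothetical aligned fragments: position 2 varies in two reads (candidate
# SNP), position 5 varies in a single read (treated as an error).
reads = ["ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAC", "ACGTAT"]
print(call_candidate_snps(reads))
```

A production method would also weigh base-call quality from the chromatograms, which this sketch omits.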

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

Methods which may also be used to quantify the expression of PMMM include radiolabeling
or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993)
Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by
running the assay in a high-throughput format where the oligomer or polynucleotide of interest is
presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
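
Interpolating a result from a standard curve can be sketched as below. The piecewise-linear scheme, the function name, and the dilution values are assumptions for illustration; the cited references may use other curve fits.

```python
def interpolate_amount(signal, standards):
    """Estimate an unknown amount from a measured signal.

    standards: list of (known_amount, measured_signal) pairs from a dilution
    series. Assumption: the signal rises monotonically with the amount, so
    linear interpolation between neighbouring standards is meaningful.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by measured signal
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return (a0 + a1) / 2
            fraction = (signal - s0) / (s1 - s0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("signal lies outside the range of the standard curve")

# Hypothetical standards: (amount in ng, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
print(round(interpolate_amount(0.35, curve), 3))  # midway between 1 and 2 ng
```

Signals outside the curve raise an error instead of being extrapolated, because extrapolation beyond the measured standards is unreliable.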

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotides described herein may be used as elements on a microarray. The microarray can be
used in transcript imaging techniques which monitor the relative expression levels of large numbers
of genes simultaneously as described below. The microarray may also be used to identify genetic
variants, mutations, and polymorphisms. This information may be used to determine gene function,
to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her
pharmacogenomic profile.

In another embodiment, PMMM, fragments of PMMM, or antibodies specific for PMMM
may be used as elements on a microarray. The microarray may be used to monitor or measure
protein-protein interactions, drug-target interactions, and gene expression profiles, as described
above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time (Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No. 5,840,484,
hereby expressly incorporated by reference herein). Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The 
resultant transcript image would provide a profile of gene activity. 
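
As a rough illustration of a transcript image, the sketch below counts expressed genes and computes their relative abundance from per-gene signal values. The function name, gene names, and counts are hypothetical, and real array data would need background correction first.

```python
def transcript_image(counts):
    """Summarize expression as (number of expressed genes, relative abundances).

    counts: mapping of gene name to a non-negative signal or transcript count.
    Assumption: a gene counts as expressed if its value is above zero.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no expression detected in the sample")
    expressed = {gene: value for gene, value in counts.items() if value > 0}
    abundance = {gene: value / total for gene, value in expressed.items()}
    return len(expressed), abundance

# Hypothetical counts for one tissue sample.
sample = {"geneA": 50, "geneB": 30, "geneC": 0, "geneD": 20}
n_expressed, profile = transcript_image(sample)
print(n_expressed, profile)
```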

Transcript images may be generated using transcripts isolated from tissues, cell lines,
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo,
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line. 

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a
compound with known toxicity, it is likely to share those toxic properties. These fingerprints or
signatures are most useful and refined when they contain expression information from a large number
of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest
quality signature. Genes whose expression is not altered by any tested compound are also important,
as the levels of expression of these genes are used to normalize the rest of the expression
data. The normalization procedure is useful for comparison of expression data after treatment with
different compounds. While the assignment of gene function to elements of a toxicant signature aids
in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the
statistical matching of signatures which leads to prediction of toxicity (see, for example, Press
Release 00-02 from the National Institute of Environmental Health Sciences, released February 29,
2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and
desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
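
The normalization and signature-matching steps described above can be sketched as follows. This is only one plausible scheme: scaling by the mean of unaltered reference genes, log2 ratios as the signature, and Pearson correlation as the similarity measure are all assumptions, as are the gene names and values.

```python
import math

def normalize(expression, reference_genes):
    """Scale expression values by the mean level of the unaltered reference genes."""
    scale = sum(expression[g] for g in reference_genes) / len(reference_genes)
    return {gene: value / scale for gene, value in expression.items()}

def toxicant_signature(treated, untreated, reference_genes):
    """Log2 ratio of normalized treated to untreated expression per gene."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {g: math.log2(t[g] / u[g]) for g in t if g not in reference_genes}

def signature_similarity(sig_a, sig_b):
    """Pearson correlation over the genes shared by two signatures."""
    genes = sorted(set(sig_a) & set(sig_b))
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return cov / (sx * sy)

# Hypothetical data: ref1 and ref2 are genes assumed unaffected by treatment.
refs = ["ref1", "ref2"]
untreated = {"ref1": 100, "ref2": 100, "g1": 10, "g2": 10, "g3": 10}
known_toxicant = {"ref1": 200, "ref2": 200, "g1": 80, "g2": 20, "g3": 5}
test_compound = {"ref1": 50, "ref2": 50, "g1": 20, "g2": 5, "g3": 1.25}

sig_known = toxicant_signature(known_toxicant, untreated, refs)
sig_test = toxicant_signature(test_compound, untreated, refs)
print(signature_similarity(sig_known, sig_test))
```

In this example the two treated samples differ fourfold in overall signal, yet after normalization their signatures coincide, which is the purpose of the reference genes.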

In an embodiment, the toxicity of a test compound can be assessed by treating a biological 
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the 
treated biological sample are hybridized with one or more probes specific to the polynucleotides of 
the present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared with
levels in an untreated biological sample. Differences in the transcript levels between the two samples 
are indicative of a toxic response caused by the test compound in the treated sample. 

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression
in a particular tissue or cell type. Each protein component of a proteome can be subjected
individually to further analysis. Proteome expression patterns, or profiles, are analyzed by
quantifying the number of expressed proteins and their relative abundance under given conditions and
at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the
polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using
two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric
focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate
slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot
is generally proportional to the level of the protein in the sample. The optical densities of
equivalently positioned protein spots from different samples, for example, from biological samples
either treated or untreated with a test compound or therapeutic agent, are compared to identify any
changes in protein spot density related to the treatment. The proteins in the spots are partially
sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed 
by mass spectrometry. The identity of the protein in a spot may be determined by comparing its 
partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences 
of interest. In some cases, further sequence data may be obtained for definitive protein identification.
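
The comparison of a partial sequence with the polypeptide sequences of interest can be sketched as an exact substring search. The minimum of 5 contiguous residues comes from the text; the function name and the sequences are hypothetical, and a practical method would also tolerate mismatches.

```python
def match_partial_sequence(partial, polypeptides, min_length=5):
    """Return names of polypeptides that contain the partial sequence exactly.

    polypeptides: mapping of name to amino acid sequence (one-letter code).
    The partial sequence must have at least min_length contiguous residues.
    """
    partial = partial.upper()
    if len(partial) < min_length:
        raise ValueError(f"partial sequence must have at least {min_length} residues")
    return [name for name, seq in polypeptides.items() if partial in seq.upper()]

# Hypothetical sequences, not taken from the disclosure.
candidates = {
    "candidate_1": "MKTAYIAKQRQISFVK",
    "candidate_2": "MSDNELLKAYIG",
}
print(match_partial_sequence("AYIAK", candidates))
```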

A proteomic profile may also be generated using antibodies specific for PMMM to quantify
the levels of PMMM expression. In one embodiment, the antibodies are used as elements on a
microarray, and protein expression levels are quantified by exposing the microarray to the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed
by a variety of methods known in the art, for example, by reacting the proteins in the sample with a
thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at
each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins that are expressed in the treated 
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing
the amino acid residues of the individual proteins and comparing these partial sequences to the
polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins from the biological sample are 
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount of 
protein between the two samples is indicative of a toxic response to the test compound in the treated 
sample. 

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan,
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-
2155; Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays are well
known and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach,
Oxford University Press, London).

In another embodiment of the invention, nucleic acid sequences encoding PMMM may be 
used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. 
Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may 
be preferable over coding sequences. For example, conservation of a coding sequence among
members of a multi-gene family may potentially cause undesired cross hybridization during
chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific
region of a chromosome, or to artificial chromosome constructions, e.g., human artificial
chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes
(BACs), bacterial P1 constructions, or single chromosome cDNA libraries (Harrington, J.J. et al.
(1997) Nat. Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; Trask, B.J. (1991) Trends
Genet. 7:149-154). Once mapped, the nucleic acid sequences may be used to develop genetic linkage
maps, for example, which correlate the inheritance of a disease state with the inheritance of a
particular chromosome region or restriction fragment length polymorphism (RFLP) (Lander, E.S. and
D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357).
Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding PMMM on a physical
map and a specific disorder, or a predisposition to a specific disorder, may help define the region of
DNA associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23,
any sequences mapping to that area may represent associated or regulatory genes for further
investigation (Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the instant
invention may also be used to detect differences in the chromosomal location due to translocation,
inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, PMMM, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between PMMM and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a
solid substrate. The test compounds are reacted with PMMM, or fragments thereof, and washed.
Bound PMMM is then detected by methods well known in the art. Purified PMMM can also be
coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding PMMM specifically compete with a test compound for binding
PMMM. In this manner, antibodies can be used to detect the presence of any peptide which shares
one or more antigenic determinants with PMMM.

In additional embodiments, the nucleotide sequences which encode PMMM may be used in
any molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications, and publications mentioned above and below,
including U.S. Ser. No. 60/322,196, U.S. Ser. No. 60/324,134, U.S. Ser. No. 60/327,233, U.S. Ser.
No. 60/332,423, U.S. Ser. No. 60/334,145, U.S. Ser. No. 60/334,229, U.S. Ser. No. 60/337,451, U.S.
Ser. No. 60/343,980, U.S. Ser. No. 60/346,198, U.S. Ser. No. 60/348,887, U.S. Ser. No. 60/351,928,
and U.S. Ser. No. 60/366,837, are hereby expressly incorporated by reference.

EXAMPLES 

I. Construction of cDNA Libraries 

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database
(Incyte Genomics, Palo Alto CA). Some tissues were homogenized and lysed in guanidinium
isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of
denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with
chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA
purification kit (Ambion, Austin TX).

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended
procedures or similar methods known in the art (Ausubel et al., supra, ch. 5). Reverse transcription
was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to
double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or
enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000,
SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or
preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites
of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid
(Invitrogen, Carlsbad CA), PCDNA2.1 plasmid (Invitrogen), PBK-CMV plasmid (Stratagene),
PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo
Alto CA), pRARE (Incyte Genomics), or pINCY (Incyte Genomics), or derivatives thereof.
Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue,
XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Invitrogen.

II. Isolation of cDNA Clones

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96
plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically
using PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence
scanner (Labsystems Oy, Helsinki, Finland).

III. Sequencing and Analysis

Incyte cDNA recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation
such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal
cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared
using reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the
ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (Ausubel et al., supra, ch.
7). Some of the cDNA sequences were selected for extension using the techniques disclosed in
Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens,
Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); hidden
Markov model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft,
D.H. et al. (2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as
SMART (Schultz, J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002)
Nucleic Acids Res. 30:242-244). (HMM is a probabilistic approach, which analyzes consensus
primary structures of gene families; see, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol.
6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and
HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide
sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences,
or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA
assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages were screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin at any
of the methionine residues of the full length translated polypeptide. Full length polypeptide
sequences were subsequently analyzed by querying against databases such as the GenBank protein
databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM,
Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM, INCY, and
TIGRFAM; and HMM-based protein domain databases such as SMART. Full length polynucleotide
sequences are also analyzed using MACDNASIS PRO software (MiraiBio, Alameda CA) and
LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are
generated using default parameters specified by the CLUSTAL algorithm as incorporated into the
MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent
identity between aligned sequences.

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide 
and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ 
ID NO:32-62. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization 
and amplification technologies are described in Table 4, column 2.

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative protein modification and maintenance molecules were initially identified by running
the Genscan gene identification program against public genomic sequence databases (e.g., gbpri and
gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA
sequences from a variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94;
Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates
predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon.
The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The
maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of
these Genscan predicted cDNA sequences encode protein modification and maintenance molecules,
the encoded polypeptides were analyzed by querying against PFAM models for protein modification
and maintenance molecules. Potential protein modification and maintenance molecules were also
identified by homology to Incyte cDNA sequences that had been annotated as protein modification
and maintenance molecules. These selected Genscan-predicted sequences were then compared by
BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted
sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the
sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to
find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing
evidence for transcription. When Incyte cDNA coverage was available, this information was used to
correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were
obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or
public cDNA sequences using the assembly process described in Example III. Alternatively, full
length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted
coding sequences.

V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences

Partial cDNA sequences were extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on
more than one sequence in the cluster were identified, and intervals thus identified were considered to
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear
along their parent sequences to generate the longest possible sequence, as well as sequence variants.
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended
with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
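
The grouping of intervals "by transitivity" described above can be illustrated with a short sketch. The specification does not disclose the data structure used by the stitching algorithm; the union-find approach, the function name group_equivalent, and the interval labels below are assumptions chosen only to show how pairwise matches collapse into equivalence classes.

```python
def group_equivalent(pairs):
    """Group labels into equivalence classes from pairwise matches (union-find).

    Illustrative only: the stitching algorithm's real implementation is not
    described in the text.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two classes

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


# Hypothetical interval labels: one cDNA interval matching two genomic intervals,
# plus an unrelated cDNA/genomic match.
matches = [
    ("cDNA_1", "genomic_A"),
    ("cDNA_1", "genomic_B"),
    ("cDNA_2", "genomic_C"),
]
classes = group_equivalent(matches)
```

Here genomic_A and genomic_B are never compared directly, yet both fall into the same class as cDNA_1, which mirrors the way a cDNA can bridge two otherwise unrelated genomic sequences.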
"Stretched" Sequences 

Partial DNA sequences were extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both were used as probes to search for
homologous genomic sequences from the public human genome databases. Partial DNA sequences
were therefore "stretched" or extended by the addition of homologous genomic sequences. The
resultant stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of PMMM Encoding Polynucleotides

The sequences which were used to assemble SEQ ID NO:32-62 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:32-62 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map 
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p- 
arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between 
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in 
humans, although this can vary widely due to hot and cold spots of recombination.) The cM 
distances are based on genetic markers mapped by Genethon which provide boundaries for radiation 
hybrid markers whose sequences were included in each of the clusters. Human genome maps and 
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site 
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified
disease genes map within or in proximity to the intervals indicated above.
VII. Analysis of Polynucleotide Expression

Northern analysis is a laboratory technique used to detect the presence of a transcript of a 
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs 
from a particular cell type or tissue have been bound (Sambrook and Russell, supra, ch. 7; Ausubel et 
al., supra, ch. 4). 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much 
faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer 
search can be modified to determine whether any particular match is categorized as exact or similar. 
The basis of the search is the product score, which is defined as: 

\[
\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
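
The product score arithmetic can be checked with a minimal sketch. It assumes a single ungapped HSP described only by its counts of matching and mismatching bases; the function names and the clamp at zero are illustrative assumptions and are not taken from the specification.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score one ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches: int, mismatches: int, len1: int, len2: int) -> float:
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len1, len2)
    score = blast_score(matches, mismatches) * percent_identity / (5 * shorter)
    # Assumption: clamp at 0 so that very poor HSPs stay within the 0-100 range.
    return max(score, 0.0)


full_identity = product_score(1000, 0, 1000, 1500)      # 100% identity, full overlap
partial_overlap = product_score(700, 0, 1000, 1000)     # 100% identity, 70% overlap
lower_identity = product_score(880, 120, 1000, 1000)    # 88% identity, full overlap
```

Under these assumptions the first two cases give 100 and 70, and the 88% identity case gives a value just under 69, which agrees with the approximate figure of 70 quoted in the text.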

Alternatively, polynucleotides encoding PMMM are analyzed with respect to the tissue 
sources from which they were derived. For example, some full length sequences are assembled, at 
least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is
derived from a cDNA library constructed from a human tissue. Each human tissue is classified into 
one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive 
system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; 
germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; 
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
The number of libraries in each category is counted and divided by the total number of libraries
across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and 
disease-specific expression of cDNA encoding PMMM. cDNA sequences and cDNA library/tissue 
information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA). 
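
The category percentages described above amount to a simple count-and-divide step, sketched here. The function name and the example library list are hypothetical; real category assignments would come from the library annotations in the database.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return {category: percent of libraries} for a list of category labels."""
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: four libraries in which a given cDNA was found.
libraries = ["nervous system", "nervous system", "liver", "digestive system"]
percentages = category_percentages(libraries)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.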
VIII. Extension of PMMM Encoding Polynucleotides

Full length polynucleotides are produced by extension of an appropriate fragment of the full 
length molecule using oligonucleotide primers designed from this fragment. One primer was 
synthesized to initiate 5' extension of the known fragment, and the other primer was synthesized to 
initiate 3' extension of the known fragment. The initial primers were designed using OLIGO 4.06 
software (National Biosciences), or another appropriate program to be about 22 to 30 nucleotides in 
length, to have a GC content of about 50% or more, and to anneal to the target sequence at 
temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin 
structures and primer-primer dimerizations was avoided.
Selected human cDNA libraries were used to extend the sequence. If more than one
extension was necessary or desired, additional or nested sets of primers were designed. 
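
The length and GC content criteria for the initial primers can be expressed as a simple screen. This is a minimal sketch only: it does not estimate annealing temperature or detect hairpins and primer dimers, which the OLIGO software handles, and the function names and example sequence are hypothetical.

```python
def gc_content(primer: str) -> float:
    """Percent of bases in the primer that are G or C."""
    if not primer:
        raise ValueError("primer must not be empty")
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def passes_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 50.0) -> bool:
    """Check the 22-30 nucleotide length and >= 50% GC criteria only."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


candidate = "GCGTACCTGGAGCTGACCGATTGC"  # hypothetical 24-mer, not from the sequence listing
candidate_ok = passes_basic_criteria(candidate)
```

A candidate that passes this screen would still need its annealing temperature and secondary structure evaluated before use.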

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme
(Invitrogen), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.
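
The programmed hold time of such a cycling protocol can be estimated with a short calculation. The sketch assumes that "repeated 20 times" means 20 passes in addition to the first pass through Steps 2 to 4, and it ignores temperature ramp times; both are assumptions, and the function and variable names are illustrative.

```python
def total_cycling_seconds(initial_s, cycle_steps_s, repeats, final_s):
    """Sum programmed hold times, excluding ramping and the final 4°C storage."""
    passes = 1 + repeats  # Steps 2-4 run once, then are repeated `repeats` times
    return initial_s + passes * sum(cycle_steps_s) + final_s


# First program above: 94°C 3 min; (15 sec, 1 min, 2 min) cycles; 68°C 5 min.
extension_program = dict(initial_s=180, cycle_steps_s=(15, 60, 120), repeats=20, final_s=300)
total_seconds = total_cycling_seconds(**extension_program)
total_minutes = total_seconds / 60
```

Under these assumptions the program holds for 4575 seconds, or a little over 76 minutes, before the 4°C storage step.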

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun
sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels,
fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were
religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham
Biosciences), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and
transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing
media, and individual colonies were picked and cultured overnight at 37°C in 384-well plates in
LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step
1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3,
and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was quantified by
PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries
were reamplified using the same conditions as described above. Samples were diluted with 20%
dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers
and the DYENAMIC DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE Terminator
cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotides are verified using the above procedure or are used
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for
such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in PMMM Encoding Polynucleotides 

Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were
identified in SEQ ID NO:32-62 using the LIFESEQ database (Incyte Genomics). Sequences from the
same gene were clustered together and assembled as described in Example III, allowing the
identification of all sequence variants in the gene. An algorithm consisting of a series of filters was
used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of
basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment
errors and errors resulting from improper trimming of vector sequences, chimeras, and splice
variants. An automated procedure of advanced chromosome analysis analyzed the original
chromatogram files in the vicinity of the putative SNP. Clone error filters used statistically generated
algorithms to identify errors introduced during laboratory processing, such as those caused by reverse
transcriptase, polymerase, or somatic mutation. Clustering error filters used statistically generated
algorithms to identify errors resulting from clustering of close homologs or pseudogenes, or due to
contamination by non-human sequences. A final set of filters removed duplicates and SNPs found in
immunoglobulins or T-cell receptors.
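
The minimum Phred quality score used by the preliminary filters can be related to a base-calling error probability through the standard definition Q = -10 log10(P). The sketch below shows that conversion and a threshold test; the function names and the example quality values are hypothetical, and the remaining filters in the series are not modeled.

```python
def phred_to_error_probability(quality: float) -> float:
    """Convert a Phred quality score Q to the estimated base-call error probability."""
    return 10 ** (-quality / 10)


def passes_quality_filter(base_qualities, position: int, min_quality: int = 15) -> bool:
    """Keep a putative SNP only if the base call at `position` meets the threshold."""
    return base_qualities[position] >= min_quality


# Hypothetical per-base quality values for one read.
read_qualities = [32, 28, 14, 40, 15]
kept_positions = [i for i in range(len(read_qualities))
                  if passes_quality_filter(read_qualities, i)]
threshold_error = phred_to_error_probability(15)
```

A threshold of 15 corresponds to an estimated error probability of roughly 3% per base call, so positions below it are discarded before the later clone and clustering filters are applied.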

Certain SNPs were selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprised 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprised 194 individuals (97 male, 97 female), all African Americans. The
Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele
frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed
no allelic variance in this population were not further tested in the other three populations.

X. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:32-62 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN,
Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10^7
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I,
Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

XI. Microarrays 

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler et al., supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed.
(1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the
surface of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical
array may be produced using available methods and machines well known to those of ordinary
skill in the art and may contain any appropriate number of elements (Schena, M. et al. (1995)
Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J.
Hodgson (1998) Nat. Biotechnol. 16:27-31).

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof 
may comprise the elements of the microarray. Fragments or oligomers suitable for hybridization 
can be selected using software well known in the art such as LASERGENE software 
(DNASTAR). The array elements are hybridized with polynucleotides in a biological sample. 
The polynucleotides in the biological sample are conjugated to a fluorescent label or other 
molecular tag for ease of detection. After hybridization, nonhybridized nucleotides from the
biological sample are removed, and a fluorescence scanner is used to detect hybridization at each
array element. Alternatively, laser desorption and mass spectrometry may be used for detection
of hybridization. The degree of complementarity and the relative abundance of each
polynucleotide which hybridizes to an element on the microarray may be assessed. In one
embodiment, microarray preparation and usage is described in detail below.
Tissue or Cell Sample Preparation

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X
first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP,
40 µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse
transcription reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte Genomics). Specific control poly(A)+ RNAs are synthesized by in
vitro transcription from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each
reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M
sodium hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the
RNA. Samples are purified using two successive CHROMA SPIN 30 gel filtration spin columns
(Clontech, Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The
sample is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY)
and resuspended in 14 µl 5X SSC/0.2% SDS.
Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array
element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR
amplification uses primers complementary to the vector sequences flanking the cDNA insert.
Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final
quantity greater than 5 µg. Amplified array elements are then purified using SEPHACRYL-400
(Amersham Biosciences).

Purified array elements are immobilized on polymer-coated glass slides. Glass
microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive
distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric
acid (VWR Scientific Products Corporation (VWR), West Chester PA), washed extensively in
distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated
slides are cured in a 110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in
U.S. Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an
average concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-
speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per
slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.
Hybridization 

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3
and Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The
sample mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and
covered with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a
cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity
internally by the addition of 140 µl of 5X SSC in a corner of the chamber. The chamber
containing the arrays is incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min
at 45°C in a first wash buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a
second wash buffer (0.1X SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with 
an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating 
spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The 
excitation laser light is focused on the array using a 20X microscope objective (Nikon, Inc., 
Melville NY). The slide containing the array is placed on a computer-controlled X-Y stage on 
the microscope and raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the 
present example is scanned with a resolution of 20 micrometers. 

In two separate scans, a mixed gas multiline laser excites the two fluorophores
sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors
(PMT R1477, Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two
fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are
used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and
650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the
appropriate filters at the laser source, although the apparatus is capable of recording the spectra
from both fluorophores simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by 
a cDNA control species added to the sample mixture at a known concentration. A specific 
location on the array contains a complementary DNA sequence, allowing the intensity of the 
signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. 
When two samples from different sources (e.g., representing test and control cells), each labeled 
with a different fluorophore, are hybridized to a single array for the purpose of identifying genes 
that are differentially expressed, the calibration is done by labeling samples of the calibrating 
cDNA with the two fluorophores and adding identical amounts of each to the hybridization 
mixture. 
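
One way to apply the two-color calibration described above is to scale one channel so that the calibrating cDNA, which is present in identical amounts in both labeled samples, reads the same in both. The sketch below is a minimal illustration under that assumption; the function names and the example intensities are hypothetical and are not taken from the original method.

```python
def balance_factor(control_cy3: float, control_cy5: float) -> float:
    """Factor that scales Cy5 signals so the control spot matches in both channels."""
    if control_cy5 <= 0:
        raise ValueError("control Cy5 signal must be positive")
    return control_cy3 / control_cy5


def balanced_ratio(cy3: float, cy5: float, factor: float) -> float:
    """Cy5/Cy3 expression ratio after balancing the Cy5 channel."""
    if cy3 <= 0:
        raise ValueError("Cy3 signal must be positive")
    return (cy5 * factor) / cy3


# Hypothetical readings: the control spot reads 1200 (Cy3) and 800 (Cy5).
factor = balance_factor(1200.0, 800.0)
ratio = balanced_ratio(cy3=500.0, cy5=1000.0, factor=factor)
print(factor, ratio)  # 1.5 3.0
```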

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital 
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible 
PC. The digitized data are displayed as an image where the signal intensity 
is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low 
signal) to red (high signal). The data are also analyzed quantitatively. Where two different 
fluorophores are excited and measured simultaneously, the data are first corrected for optical 
crosstalk (due to overlapping emission spectra) between the fluorophores using each 
fluorophore's emission spectrum. 
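
The crosstalk correction mentioned above can be treated as a small linear unmixing problem. The sketch below assumes that each detector records its own fluorophore plus a fixed fraction of the other, with the leak fractions estimated beforehand from the emission spectra and filter passbands. The coefficient values and names are illustrative assumptions and are not taken from the original analysis software.

```python
def unmix(m_cy3: float, m_cy5: float,
          cy3_into_cy5: float, cy5_into_cy3: float) -> tuple[float, float]:
    """Recover true Cy3 and Cy5 signals from crosstalk-contaminated readings.

    Assumed model (hypothetical leak fractions):
        m_cy3 = s_cy3 + cy5_into_cy3 * s_cy5
        m_cy5 = cy3_into_cy5 * s_cy3 + s_cy5
    """
    det = 1.0 - cy3_into_cy5 * cy5_into_cy3
    if det == 0:
        raise ValueError("leak fractions make the system singular")
    s_cy3 = (m_cy3 - cy5_into_cy3 * m_cy5) / det
    s_cy5 = (m_cy5 - cy3_into_cy5 * m_cy3) / det
    return s_cy3, s_cy5


# Hypothetical example: 10% of Cy3 leaks into the Cy5 detector, 2% the other way.
s_cy3, s_cy5 = unmix(1008.0, 500.0, cy3_into_cy5=0.10, cy5_into_cy3=0.02)
```

With these illustrative inputs the corrected signals come out near 1000 (Cy3) and 400 (Cy5), which are the values used to construct the example readings.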

A grid is superimposed over the fluorescence signal image such that the signal from each 
spot is centered in each element of the grid. The fluorescence signal within each element is then 
integrated to obtain a numerical value corresponding to the average intensity of the signal. The 
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte 
Genomics). Array elements that exhibit at least about a two-fold change in expression, a signal- 
to-background ratio of at least about 2.5, and an element spot size of at least about 40%, are 
considered to be differentially expressed. 
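
The three selection criteria above can be expressed as a simple filter. The sketch below is not the GEMTOOLS implementation; the field names are hypothetical, the thresholds are treated as hard cutoffs even though the text says "at least about", and spot size is assumed to be reported as a percentage.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    ratio: float                 # balanced test/control expression ratio (> 0)
    signal_to_background: float
    spot_size_pct: float         # element spot size, in percent


def is_differentially_expressed(e: Element,
                                min_fold: float = 2.0,
                                min_sb: float = 2.5,
                                min_spot_pct: float = 40.0) -> bool:
    """Apply the fold-change, signal-to-background, and spot-size criteria."""
    if e.ratio <= 0:
        return False
    fold = max(e.ratio, 1.0 / e.ratio)  # treats increases and decreases alike
    return (fold >= min_fold
            and e.signal_to_background >= min_sb
            and e.spot_size_pct >= min_spot_pct)


elements = [
    Element("up", ratio=2.4, signal_to_background=3.0, spot_size_pct=55.0),
    Element("down", ratio=0.4, signal_to_background=4.1, spot_size_pct=60.0),
    Element("weak", ratio=3.0, signal_to_background=1.8, spot_size_pct=70.0),
    Element("flat", ratio=1.2, signal_to_background=5.0, spot_size_pct=80.0),
]
hits = [e.name for e in elements if is_differentially_expressed(e)]
print(hits)  # ['up', 'down']
```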

Expression 

For example, SEQ ID NO:40 showed decreased expression in peripheral blood 
mononuclear cells (PBMCs) treated with PMA and ionomycin versus untreated PBMCs as 
determined by microarray analysis. PBMCs are isolated 
from freshly obtained peripheral blood. PBMCs are stimulated in vitro with soluble PMA and 
ionomycin for 1, 2, 4, 8, and 20 hours. These treated cells are compared to untreated PBMCs 
kept in culture. Therefore, in various embodiments, SEQ ID NO:40 can be used for one or more of 
the following: i) monitoring treatment of immune disorders and related diseases and conditions, ii) 
diagnostic assays for immune disorders and related diseases and conditions, and iii) developing 
therapeutics and/or other treatments for immune disorders and related diseases and conditions. 

In another example, SEQ ID NO:43 was differentially expressed in human breast tumor cell 
lines as compared to a nonmalignant breast epithelial cell line, MCF-10A. Histological and 
molecular evaluation of breast tumors reveals that the development of breast cancer evolves through a 
multi-step process whereby pre-malignant mammary epithelial cells undergo a relatively defined 
sequence of events leading to tumor formation. An early event in tumor development is ductal 
hyperplasia. Cells undergoing rapid neoplastic growth gradually progress to invasive carcinoma and 
become metastatic to the lung, bone, and potentially other organs. Several variables that may 
influence the process of tumor progression and malignant transformation include genetic factors, 
environmental factors, growth factors, and hormones. Based on the complexity of this process, it is 
critical to study a population of human mammary epithelial cells undergoing the process of malignant 
transformation, and to associate specific stages of progression with phenotypic and molecular 
characteristics. In a cross-comparison study, two cell lines out of nine tested exhibited differential 
expression as compared to controls. BT-20 is a breast carcinoma cell line derived in vitro from cells 
emigrating out of thin slices of the tumor mass isolated from a 74-year-old female. MDA-mb-435S is 
a spindle-shaped strain derived from the pleural effusion of a 31-year-old female with metastatic, 
ductal adenocarcinoma of the breast. In this experiment, the expression of SEQ ID NO:43 was 
increased by at least two-fold in these breast tumor cell lines. Therefore, in various embodiments, 
SEQ ID NO:43 can be used for one or more of the following: i) monitoring treatment of breast 
cancer, ii) diagnostic assays for breast cancer, and iii) developing therapeutics and/or other treatments 
for breast cancer. 

In another example, SEQ ID NO:43-44 were differentially expressed in three separate 
experiments in which human lung tumor cells were tested in a pair comparison with normal lung from 
the same donor. Lung cancers are divided into four histopathologically distinct groups. Three groups 
(squamous cell carcinoma, adenocarcinoma, and large cell carcinoma) are classified as non-small cell 
lung cancers (NSCLCs). The fourth group of cancers is referred to as small cell lung cancer (SCLC). 
Collectively, NSCLCs account for approximately 70% of cases while SCLCs account for 
approximately 18% of cases. The molecular and cellular biology underlying the development and 
progression of lung cancer are incompletely understood. Deletions on chromosome 3 are common in 
this disease and are thought to indicate the presence of a tumor suppressor gene in this region. 
Activating mutations in K-ras are commonly found in lung cancer and are the basis of one of the 
mouse models for the disease. Analysis of gene expression patterns associated with the development 
and progression of the disease will yield tremendous insight into the biology underlying this disease, 
and will lead to the development of improved diagnostics and therapeutics. In these experiments, the 
expression of SEQ ID NO:43-44 was increased by at least two-fold in the lung tumor cells as 
compared to the normal lung tissue cells from the same donor. 

These experiments indicate that SEQ ID NO:43 and SEQ ID NO:44 exhibited significant 
differential expression patterns using microarray techniques. Therefore, in various embodiments, 
SEQ ID NO:43-44 can be used for one or more of the following: i) monitoring treatment of lung 
cancer, ii) diagnostic assays for lung cancer, and iii) developing therapeutics and/or other treatments 
for lung cancer. 

In another example, SEQ ID NO:45 was differentially expressed in human breast tumor cell 
lines compared to nonmalignant breast epithelial cell lines. Histological and molecular evaluation of 
breast tumors reveals that the development of breast cancer evolves through a multi-step process 
whereby pre-malignant mammary epithelial cells undergo a relatively defined sequence of events 
leading to tumor formation. An early event in tumor development is ductal hyperplasia. Cells 
undergoing rapid neoplastic growth gradually progress to invasive carcinoma and become metastatic 
to the lung, bone, and potentially other organs. Several variables that may influence the process of 
tumor progression and malignant transformation include genetic factors, environmental factors, 
growth factors, and hormones. Based on the complexity of this process, it is critical to study a 
population of human mammary epithelial cells undergoing the process of malignant transformation. 

In one set of experiments, human primary epithelial breast cells (HMECs) isolated from a 
normal donor were compared to various types of breast cancer cell lines. Of six breast cancer cell 
lines tested, SEQ ID NO:45 was underexpressed by at least two-fold in two, MCF-7 (breast 
adenocarcinoma) and SK-BR-3 (human breast adenocarcinoma, which is also tumorigenic in nude 
mice), as compared to HMEC cells. 

SEQ ID NO:45 was also underexpressed by at least two-fold in MCF-7 breast 
adenocarcinoma cells as compared to nonmalignant MCF10A cells isolated from normal breast 
epithelial tissue. 

These experiments indicate that SEQ ID NO:45 exhibits significant differential expression 
patterns using microarray techniques. Therefore, in various embodiments, SEQ ID NO:45 can be 
used for one or more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays 
for breast cancer, and iii) developing therapeutics and/or other treatments for breast cancer. 

In another example, SEQ ID NO:49 showed differential expression in breast cancer tissue, as 
determined by microarray analysis. In order to better determine the molecular and phenotypic 
characteristics associated with different stages of breast cancer, breast carcinoma cell lines at various 
stages of tumor progression were compared to primary human breast epithelial cells. The breast 
carcinoma cell lines include MCF7, a breast adenocarcinoma cell line derived from the pleural 
effusion of a 69-year-old female; Sk-BR-3, a breast adenocarcinoma cell line isolated from a 
malignant pleural effusion of a 43-year-old female; and BT-20, a breast adenocarcinoma isolated in 
vitro from cells emigrating out of thin slices of a tumor mass isolated from a 74-year-old female. The 
primary mammary epithelial cell line HMEC was derived from normal human mammary tissue 
(Clonetics, San Diego, CA). All cell cultures were propagated in a chemically-defined medium, 
according to the supplier's recommendations and grown to 70-80% confluence prior to RNA 
isolation. The microarray experiments showed that expression of SEQ ID NO:49 was decreased by at 
least two-fold in all three breast carcinoma lines (MCF7, Sk-BR-3, and BT20) relative to primary 
mammary epithelial cells. Therefore, in various embodiments, SEQ ID NO:49 can be used for one or 
more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays for breast 
cancer, and iii) developing therapeutics and/or other treatments for breast cancer. 

SEQ ID NO:49 also showed differential expression, as determined by microarray analysis, in 
liver C3A cells treated with one of the following steroids: beclomethasone, dexamethasone, 
progesterone, budesonide. The human C3A cell line is a clonal derivative of HepG2/C3 and has been 
established as an in vitro model of the mature human liver (Mickelson et al. (1995) Hepatology 
22:866-875; Nagendra et al. (1997) Am J Physiol 272:G408-G416). SEQ ID NO:5 showed at least a 
two-fold decrease in expression at a minimum of two out of the three time points in early confluent 
C3A cells treated with beclomethasone, budesonide, dexamethasone, or betamethasone, for 1, 3, or 6 
hours. These experiments indicate that SEQ ID NO:49 is useful in diagnostic assays for liver 
diseases and as a potential biological marker and therapeutic agent in the treatment of liver diseases 
and disorders. Therefore, in various embodiments, SEQ ID NO:49 can be used for one or more of the 
following: i) monitoring treatment of liver diseases and disorders, ii) diagnostic assays for liver 
diseases and disorders, and iii) developing therapeutics and/or other treatments for liver diseases and 
disorders. 

In another example, SEQ ID NO:51 showed differential expression, as determined by 
microarray analysis, in Alzheimer's Disease (AD). In a comparison of cerebellum tissue from a 
76-year-old male with severe AD to cerebellum tissue from a normal 67-year-old male, the expression of 
SEQ ID NO:51 was decreased at least two-fold. Therefore, in various embodiments, SEQ ID NO:51 
can be used for one or more of the following: i) monitoring treatment of Alzheimer's Disease, ii) 
diagnostic assays for Alzheimer's Disease, and iii) developing therapeutics and/or other treatments 
for Alzheimer's Disease. 

SEQ ID NO:51 also showed differential expression associated with colon cancer, as 
determined by microarray analysis. Normal colon tissue was compared to colon tumor tissue from a 
67-year-old donor with moderately differentiated adenocarcinoma. The expression of SEQ ID NO:51 
was decreased at least two-fold in the tumor tissue as compared to the normal tissue. Therefore, in 
various embodiments, SEQ ID NO:51 can be used for one or more of the following: i) monitoring 
treatment of colon cancer, ii) diagnostic assays for colon cancer, and iii) developing therapeutics 
and/or other treatments for colon cancer. 

In another example, the expression of SEQ ID NO:56 in a primary prostate epithelial cell line 
isolated from a normal donor, PrEC, was compared to that in three prostate carcinoma cell lines. DU 
145 is a prostate carcinoma cell line isolated from a metastatic site in the brain of a 69-year-old male 
with widespread metastatic prostate carcinoma. DU 145 has no detectable sensitivity to hormones, 
forms colonies in semi-solid medium, is only weakly positive for acid phosphatase, and is negative 
for prostate specific antigen. LNCaP is a prostate carcinoma cell line isolated from a lymph node 
biopsy of a 50-year-old male with metastatic prostate carcinoma. LNCaP cells express prostate 
specific antigens, produce prostatic acid phosphatase, and express androgen receptors. PC-3 is a 
prostate adenocarcinoma cell line isolated from a metastatic site in the bone of a 62-year-old male 
with grade IV prostate adenocarcinoma. The expression of SEQ ID NO:56 was increased by at least 
two-fold in DU 145 cells grown under restrictive conditions as compared to PrEC cells grown under 
restrictive conditions. Therefore, in various embodiments, SEQ ID NO:56 can be used for one or 
more of the following: i) monitoring treatment of prostate cancer, ii) diagnostic assays for prostate 
cancer, and iii) developing therapeutics and/or other treatments for prostate cancer. 

In another example, SEQ ID NO:58, SEQ ID NO:59 and SEQ ID NO:60 showed differential 
expression associated with breast cancer, as determined by microarray analysis. The gene expression 
profile of a nonmalignant mammary epithelial cell line was compared to the gene expression profiles 
of breast carcinoma lines at different stages of tumor progression. Cell lines compared included: a) 
BT-20, a breast carcinoma cell line derived in vitro from the cells emigrating out of thin slices of 
tumor mass isolated from a 74-year-old female, b) BT-474, a breast ductal carcinoma cell line that 
was isolated from a solid, invasive ductal carcinoma of the breast obtained from a 60-year-old 
woman, c) BT-483, a breast ductal carcinoma cell line that was isolated from a papillary invasive 
ductal tumor obtained from a 23-year-old normal, menstruating, parous female with a family history 
of breast cancer, d) Hs578T, a breast ductal carcinoma cell line isolated from a 74-year-old female 
with breast carcinoma, e) MCF7, a nonmalignant breast adenocarcinoma cell line isolated from the 
pleural effusion of a 69-year-old female, f) MCF-10A, a breast mammary gland (luminal ductal 
characteristics) cell line isolated from a 36-year-old woman with fibrocystic breast disease, g) 
MDA-mb-435S, a spindle-shaped strain that evolved from the parent line (435) isolated by R. Cailleau from 
pleural effusion of a 31-year-old female with metastatic, ductal adenocarcinoma of the breast, h) 
Sk-BR-3, a breast adenocarcinoma cell line isolated from a malignant pleural effusion of a 43-year-old 
female, i) T-47D, a breast carcinoma cell line isolated from a pleural effusion obtained from a 
54-year-old female with an infiltrating ductal carcinoma of the breast and j) HMEC, a primary breast 
epithelial cell line isolated from a normal donor. SEQ ID NO:58 expression was reduced by at least 
two-fold in BT20 and MCF7 cells as compared to HMEC cells. The expression of SEQ ID NO:59 
was decreased by at least two-fold in carcinoma cell lines BT20, Sk-BR-3, T-47D, MDA-mb-435S 
and MCF7, as compared to HMEC cells. SEQ ID NO:60 expression was upregulated by at least 
two-fold in the carcinoma cell line Hs578T as compared to the HMEC cell line. Therefore, in various 
embodiments, SEQ ID NO:58, SEQ ID NO:59 and SEQ ID NO:60 can be used for one or more of the 
following: i) monitoring treatment of breast cancer, ii) diagnostic assays for breast cancer, and iii) 
developing therapeutics and/or other treatments for breast cancer. 

In another example, SEQ ID NO:60 showed differential expression associated with lung 
cancer, as determined by microarray analysis. Expression was compared in matched samples of 
normal and lung tumor tissue from individual donors. Tissue samples were provided by the Roy 
Castle International Centre for Lung Cancer Research. SEQ ID NO:60 expression was upregulated 
by at least two-fold in lung squamous cell carcinoma tissue derived from a 68-year-old female donor 
as compared to normal lung tissue from the same donor. Therefore, in various embodiments, SEQ ID 
NO:60 can be used for one or more of the following: i) monitoring treatment of lung cancer, ii) 
diagnostic assays for lung cancer, and iii) developing therapeutics and/or other treatments for lung 
cancer. 

In another example, SEQ ID NO:58 and SEQ ID NO:59 showed differential expression 
associated with ovarian cancer, as determined by microarray analysis. A normal ovary from a 
79-year-old female donor was compared to an ovarian tumor from the same donor (Huntsman Cancer 
Institute, Salt Lake City, UT). The expression of SEQ ID NO:58 and SEQ ID NO:59 was decreased 
by at least two-fold in the tumor tissue as compared to the normal tissue. Therefore, SEQ ID NO:58 
and SEQ ID NO:59 are useful in monitoring treatment of, and diagnostic assays for, ovarian cancer. 
Therefore, in various embodiments, SEQ ID NO:58-59 can be used for one or more of the following: 
i) monitoring treatment of ovarian cancer, ii) diagnostic assays for ovarian cancer, and iii) developing 
therapeutics and/or other treatments for ovarian cancer. 

In another example, SEQ ID NO:59 showed differential expression associated with steroid 
hormone responses, as determined by microarray analysis. The human C3A cell line is a clonal 
derivative of HepG2/C3 (hepatoma cell line, isolated from a 15-year-old male with liver tumor), 
which was selected for strong contact inhibition of growth. The use of a clonal population enhances 
the reproducibility of the cells. C3A cells have many characteristics of primary human hepatocytes in 
culture: i) expression of insulin receptor and insulin-like growth factor II receptor; ii) secretion of a 
high ratio of serum albumin compared with α-fetoprotein; iii) conversion of ammonia to urea and 
glutamine; iv) metabolism of aromatic amino acids; and v) proliferation in glucose-free and 
insulin-free medium. The C3A cell line is now well established as an in vitro model of the mature human 
liver (Mickelson et al. (1995) Hepatology 22:866-875; Nagendra et al. (1997) Am J Physiol 
272:G408-G416). Early confluent C3A cells were treated with progesterone or budesonide at 1, 10, 
and 100 μM for 1, 3, and 6 hours. The treated cells were compared to untreated early confluent C3A 
cells. At each of the time points, the expression of SEQ ID NO:59 was decreased by at least two-fold 
in C3A cells treated with 10 or 100 μM budesonide, and in C3A cells treated with 10 μM 
progesterone. Therefore, SEQ ID NO:59 may be useful in monitoring of, and diagnostic assays for, 
steroid hormone-induced responses. Therefore, in various embodiments, SEQ ID NO:59 can be used 
for one or more of the following: i) monitoring treatment of steroid hormone-induced responses, ii) 
diagnostic assays for steroid hormone-induced responses, and iii) developing therapeutics and/or 
other treatments for steroid hormone-induced responses. 

In another example, SEQ ID NO:61 showed differential expression associated with lung 
cancer, as determined by microarray analysis. Pair comparisons of lung tumor tissue and 
microscopically-normal tissue from the same donor were made. The expression of SEQ ID NO:61 
was increased by at least two-fold in lung squamous cell carcinoma tissue from a 68-year-old female 
as compared to normal lung tissue from the same donor (Roy Castle International Centre for Lung 
Cancer Research, Liverpool, UK). Therefore, in various embodiments, SEQ ID NO:61 can be used 
for one or more of the following: i) monitoring treatment of lung cancer, ii) diagnostic assays for lung 
cancer, and iii) developing therapeutics and/or other treatments for lung cancer. 

XII. Complementary Polynucleotides 

Sequences complementary to the PMMM-encoding sequences, or any parts thereof, are used 
to detect, decrease, or inhibit expression of naturally occurring PMMM. Although use of 
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same 
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are 
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of PMMM. To 
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence 
and used to prevent promoter binding to the coding sequence. To inhibit translation, a 
complementary oligonucleotide is designed to prevent ribosomal binding to the PMMM-encoding 
transcript. 
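
The core operation in designing such an oligonucleotide is taking the reverse complement of a chosen target window. The sketch below illustrates only that step; it does not reproduce the OLIGO 4.06 design criteria (uniqueness, melting temperature, secondary structure), and the sequence shown is a made-up placeholder rather than any PMMM-encoding sequence.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding: str, start: int = 0, length: int = 20) -> str:
    """Oligonucleotide complementary to coding[start:start + length]."""
    target = coding[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(target)


# Hypothetical 27-nt sequence, used only to demonstrate the call.
coding = "ATGGCTAGCAAGCTTGGATCCGAATTC"
oligo = antisense_oligo(coding, start=0, length=15)
print(oligo)  # AAGCTTGCTAGCCAT
```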

XIII. Expression of PMMM 

Expression and purification of PMMM is achieved using bacterial or virus-based expression 
systems. For expression of PMMM in bacteria, cDNA is subcloned into an appropriate vector 
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA 
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid 
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory 
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). 
Antibiotic resistant bacteria express PMMM upon induction with isopropyl 
beta-D-thiogalactopyranoside (IPTG). Expression of PMMM in eukaryotic cells is achieved by infecting 
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus 
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is 
replaced with cDNA encoding PMMM by either homologous recombination or bacterial-mediated 
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong 
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to 
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. 
Infection of the latter requires additional genetic modifications to baculovirus (Engelhard, E.K. et al. 
(1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 
7:1937-1945). 

In most expression systems, PMMM is synthesized as a fusion protein with, e.g., glutathione 
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, 
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 
26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on 
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham 
Biosciences). Following purification, the GST moiety can be proteolytically cleaved from PMMM at 
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification 
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins 
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel et al. (supra, 
ch. 10 and 16). Purified PMMM obtained by these methods can be used directly in the assays shown 
in Examples XVII, XVIII, XIX, and XX, where applicable. 
XIV. Functional Assays 

PMMM function is assessed by expressing the sequences encoding PMMM at 
physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a 
mammalian expression vector containing a strong promoter that drives high levels of cDNA 
expression. Vectors of choice include PCMV SPORT plasmid (Invitrogen, Carlsbad CA) and 
PCR3.1 plasmid (Invitrogen), both of which contain the cytomegalovirus promoter. 5-10 μg of 
recombinant vector are transiently transfected into a human cell line, for example, an endothelial or 
hematopoietic cell line, using either liposome formulations or electroporation. 1-2 μg of an 
additional plasmid containing sequences encoding a marker protein are co-transfected. Expression of 
a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a 
reliable predictor of cDNA expression from the recombinant vector. Marker proteins of choice 
include, e.g., Green Fluorescent Protein (GFP; Clontech), CD64, or a CD64-GFP fusion protein. 
Flow cytometry (FCM), an automated, laser optics-based technique, is used to identify transfected 
cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular 
properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events 
preceding or coincident with cell death. These events include changes in nuclear DNA content as 
measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured 
by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as 
measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and 
intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma 
membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to 
the cell surface. Methods in flow cytometry are discussed in Ormerod, M.G. (1994; Flow Cytometry, 
Oxford, New York NY). 

The influence of PMMM on gene expression can be assessed using highly purified 
populations of cells transfected with sequences encoding PMMM and either CD64 or CD64-GFP. 
CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions 
of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected 
cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake 
Success NY). mRNA can be purified from the cells using methods well known by those of skill in 
the art. Expression of mRNA encoding PMMM and other genes of interest can be analyzed by 
northern analysis or microarray techniques. 
XV. Production of PMMM Specific Antibodies 

PMMM substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., 
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to 
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols. 

Alternatively, the PMMM amino acid sequence is analyzed using LASERGENE software 
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is 
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for 
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well 
described in the art (Ausubel et al., supra, ch. 11). 

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A 
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH 
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to 
increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH 
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-PMMM 
activity by, for example, binding the peptide or PMMM to a substrate, blocking with 1% BSA, 
reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG. 
XVI. Purification of Naturally Occurring PMMM Using Specific Antibodies 

Naturally occurring or recombinant PMMM is substantially purified by immunoaffinity 
chromatography using antibodies specific for PMMM. An immunoaffinity column is constructed by 
covalently coupling anti-PMMM antibody to an activated chromatographic resin, such as 
CNBr-activated SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and 
washed according to the manufacturer's instructions. 

Media containing PMMM are passed over the immunoaffinity column, and the column is 
washed under conditions that allow the preferential absorbance of PMMM (e.g., high ionic strength 
buffers in the presence of detergent). The column is eluted under conditions that disrupt 
antibody/PMMM binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such 
as urea or thiocyanate ion), and PMMM is collected. 
XVII. Identification of Molecules Which Interact with PMMM 

PMMM, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent 
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously 
arrayed in the wells of a multi-well plate are incubated with the labeled PMMM, washed, and any 
wells with labeled PMMM complex are assayed. Data obtained using different concentrations of 
PMMM are used to calculate values for the number, affinity, and association of PMMM with the 
candidate molecules. 
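
The text does not name the analysis used to derive these binding values. One common choice for concentration-series data is a Scatchard plot, sketched below under the simplifying assumption of a single class of independent binding sites. The function name and the synthetic data are illustrative only.

```python
def scatchard_fit(bound: list[float], free: list[float]) -> tuple[float, float]:
    """Fit bound/free against bound by least squares; return (Kd, Bmax).

    Single-site model: B/F = Bmax/Kd - B/Kd, so slope = -1/Kd and the
    x-intercept is Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -1.0 / slope, -intercept / slope


# Synthetic, noise-free data generated with Kd = 5 and Bmax = 100
# (arbitrary concentration units) to check that the fit recovers them.
free = [1.0, 2.0, 5.0, 10.0, 20.0]
bound = [100.0 * f / (5.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
```

Real binding data are noisy, and a nonlinear fit of the saturation curve is often preferred over the linearized form; the Scatchard version is shown here only because it needs no external libraries.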

Alternatively, molecules interacting with PMMM are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially 
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech). 

PMMM may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT), 
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions 
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. 
Patent No. 6,057,101). 
XVIII. Demonstration of PMMM Activity 

PMMM activity can be demonstrated using a generic immunoblotting strategy or through a 
variety of specific activity assays, some of which are outlined below. As a general approach, cell 
lines or tissues transformed with a vector containing PMMM coding sequences can be assayed for 
PMMM activity by immunoblotting. Transformed cells are denatured in SDS in the presence of 
β-mercaptoethanol, nucleic acids are removed by ethanol precipitation, and proteins are purified by 
acetone precipitation. Pellets are resuspended in 20 mM Tris buffer at pH 7.5 and incubated with 
Protein G-Sepharose pre-coated with an antibody specific for PMMM. After washing, the Sepharose 
beads are boiled in electrophoresis sample buffer, and the eluted proteins subjected to SDS-PAGE. 
The SDS-PAGE is transferred to a membrane for immunoblotting, and the PMMM activity is 
assessed by visualizing and quantifying bands on the blot using the antibody specific for PMMM as 
the primary antibody and ^I-labeled IgG specific for the primary antibody as the secondary antibody. 

PMMM kinase activity is measured by quantifying the phosphorylation of a protein substrate
by PMMM in the presence of gamma-labeled 32P-ATP. PMMM is incubated with the protein
substrate, 32P-ATP, and an appropriate kinase buffer. The 32P incorporated into the substrate is
separated from free 32P-ATP by electrophoresis and the incorporated 32P is counted using a
radioisotope counter. The amount of incorporated 32P is proportional to the activity of PMMM. A
determination of the specific amino acid residue phosphorylated is made by phosphoamino acid
analysis of the hydrolyzed protein.
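
The proportionality between incorporated 32P and kinase activity can be made concrete with a short calculation. The sketch below assumes that the specific radioactivity of the ATP stock (cpm per pmol) has been measured separately and that a no-enzyme blank is available; the function name and all numbers are hypothetical.

```python
def kinase_specific_activity(sample_cpm, blank_cpm, atp_cpm_per_pmol,
                             minutes, enzyme_ug):
    """Return pmol phosphate transferred per minute per microgram of enzyme."""
    if atp_cpm_per_pmol <= 0 or minutes <= 0 or enzyme_ug <= 0:
        raise ValueError("specific radioactivity, time, and enzyme must be positive")
    # Subtract background counts; clamp at zero so noise cannot go negative.
    net_cpm = max(sample_cpm - blank_cpm, 0.0)
    # Convert counts to pmol of phosphate using the ATP specific radioactivity.
    pmol_phosphate = net_cpm / atp_cpm_per_pmol
    return pmol_phosphate / minutes / enzyme_ug


# Hypothetical example: 12,500 cpm in the substrate band, 500 cpm blank,
# ATP at 200 cpm/pmol, 10 minute reaction, 0.5 micrograms of enzyme.
activity = kinase_specific_activity(12500, 500, 200, 10, 0.5)
```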

In one alternative, PMMM activity is demonstrated by a test for galactosyltransferase
activity. This can be determined by measuring the transfer of radiolabeled galactose from
UDP-galactose to a GlcNAc-terminated oligosaccharide chain (Kolbinger, F. et al. (1998) J. Biol. Chem.
273:58-65). The sample is incubated with 14 µl of assay stock solution (180 mM sodium cacodylate,
pH 6.5, 1 mg/ml bovine serum albumin, 0.26 mM UDP-galactose, 2 µl of UDP-[3H]galactose), 1 µl of
MnCl2 (500 mM), and 2.5 µl of GlcNAcβO-(CH2)8-CO2Me (37 mg/ml in dimethyl sulfoxide) for 60
minutes at 37 °C. The reaction is quenched by the addition of 1 ml of water and loaded on a C18
Sep-Pak cartridge (Waters), and the column is washed twice with 5 ml of water to remove unreacted
UDP-[3H]galactose. The [3H]galactosylated GlcNAcβO-(CH2)8-CO2Me remains bound to the column during
the water washes and is eluted with 5 ml of methanol. Radioactivity in the eluted material is
measured by liquid scintillation counting and is proportional to galactosyltransferase activity in the 
starting sample. 

PMMM phosphatase activity is measured by the hydrolysis of p-nitrophenyl phosphate 
(PNPP). PMMM is incubated together with PNPP in HEPES buffer, pH 7.5, in the presence of 0.1%
β-mercaptoethanol at 37 °C for 60 min. The reaction is stopped by the addition of 6 ml of 10 N NaOH
and the increase in light absorbance at 410 nm resulting from the hydrolysis of PNPP is measured
using a spectrophotometer. The increase in light absorbance is proportional to the activity of PMMM
in the assay (Diamond, R.H. et al. (1994) Mol. Cell. Biol. 14:3752-3762).
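
The conversion of an absorbance increase into an amount of product follows the Beer-Lambert law. The sketch below is illustrative only: the extinction coefficient used for p-nitrophenolate (18,000 per M per cm) is an assumed, commonly quoted approximate value that should be replaced by a value calibrated for the instrument and wavelength in use, and the function name and example numbers are hypothetical.

```python
def pnpp_product_nmol(delta_absorbance, volume_ml,
                      epsilon_m_cm=18000.0, path_cm=1.0):
    """Return nmol of p-nitrophenol formed, from the absorbance increase.

    epsilon_m_cm is an ASSUMED molar extinction coefficient (M^-1 cm^-1);
    calibrate with a p-nitrophenol standard curve for real measurements.
    """
    if epsilon_m_cm <= 0 or path_cm <= 0 or volume_ml <= 0:
        raise ValueError("epsilon, path length, and volume must be positive")
    conc_molar = delta_absorbance / (epsilon_m_cm * path_cm)  # Beer-Lambert
    moles = conc_molar * (volume_ml / 1000.0)                 # ml -> litres
    return moles * 1e9                                        # mol -> nmol


# Hypothetical example: absorbance rises by 0.36 in a 1.0 ml final volume
# after a 60 minute incubation.
product = pnpp_product_nmol(0.36, 1.0)
rate_nmol_per_min = product / 60.0
```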

In the alternative, PMMM phosphatase activity is determined by measuring the amount of 
phosphate removed from a phosphorylated protein substrate. Reactions are performed with 2 or 4 nM 
enzyme in a final volume of 30 µl containing 60 mM Tris, pH 7.6, 1 mM EDTA, 1 mM EGTA, 0.1%
2-mercaptoethanol and 10 µM substrate, 32P-labeled on serine/threonine or tyrosine, as appropriate.
Reactions are initiated with substrate and incubated at 30 °C for 10-15 min. Reactions are quenched
with 450 µl of 4% (w/v) activated charcoal in 0.6 M HCl, 90 mM Na4P2O7, and 2 mM NaH2PO4, then
centrifuged at 12,000 x g for 5 min. Acid-soluble 32Pi is quantified by liquid scintillation counting
(Sinclair, C. et al. (1999) J. Biol. Chem. 274:23666-23672).

PMMM protease activity is measured by the hydrolysis of appropriate synthetic peptide 
substrates conjugated with various chromogenic molecules in which the degree of hydrolysis is 
quantified by spectrophotometric (or fluorometric) absorption of the released chromophore (Beynon,
R.J. and J.S. Bond (1994) Proteolytic Enzymes: A Practical Approach. Oxford University Press, New
York, NY, pp. 25-55). Peptide substrates are designed according to the category of protease activity
as endopeptidase (serine, cysteine, aspartic proteases, or metalloproteases), aminopeptidase (leucine
aminopeptidase), or carboxypeptidase (carboxypeptidases A and B, procollagen C-proteinase).
Commonly used chromogens are 2-naphthylamine, 4-nitroaniline, and furylacrylic acid. Assays are
performed at ambient temperature and contain an aliquot of the enzyme and the appropriate substrate
in a suitable buffer. Reactions are carried out in an optical cuvette, and the increase/decrease in
absorbance of the chromogen released during hydrolysis of the peptide substrate is measured. The
change in absorbance is proportional to the enzyme activity in the assay.

In the alternative, an assay for PMMM protease activity takes advantage of fluorescence
resonance energy transfer (FRET) that occurs when one donor and one acceptor fluorophore with an
appropriate spectral overlap are in close proximity. A flexible peptide linker containing a cleavage
site specific for PMMM is fused between a red-shifted variant (RSGFP4) and a blue variant (BFP5)
of Green Fluorescent Protein. This fusion protein has spectral properties that suggest energy transfer
is occurring from BFP5 to RSGFP4. When the fusion protein is incubated with PMMM, the substrate
is cleaved, and the two fluorescent proteins dissociate. This is accompanied by a marked decrease in
energy transfer which is quantified by comparing the emission spectra before and after the addition of
PMMM (Mitra, R.D. et al. (1996) Gene 173:13-17). This assay can also be performed in living cells.
In this case the fluorescent substrate protein is expressed constitutively in cells and PMMM is
introduced on an inducible vector so that FRET can be monitored in the presence and absence of
PMMM (Sagot, I. et al. (1999) FEBS Letters 447:53-57).
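
One simple way to express the decrease in energy transfer is as a change in the ratio of acceptor to donor emission measured before and after protease addition. The sketch below assumes that emission intensities at the acceptor (RSGFP4) and donor (BFP5) peaks have already been extracted from the spectra; the function names and intensity values are hypothetical, and the ratio metric is an illustrative choice rather than the method of the cited references.

```python
def fret_ratio(acceptor_emission, donor_emission):
    """Acceptor/donor emission ratio; higher values indicate more energy transfer."""
    if donor_emission <= 0:
        raise ValueError("donor emission must be positive")
    return acceptor_emission / donor_emission


def fret_loss_fraction(ratio_before, ratio_after):
    """Fractional decrease in the emission ratio after protease addition."""
    if ratio_before <= 0:
        raise ValueError("initial ratio must be positive")
    return (ratio_before - ratio_after) / ratio_before


# Hypothetical intensities (arbitrary units) under donor excitation
before = fret_ratio(acceptor_emission=800.0, donor_emission=400.0)
after = fret_ratio(acceptor_emission=300.0, donor_emission=600.0)
loss = fret_loss_fraction(before, after)
```

A rising donor signal together with a falling acceptor signal, as in this example, is the pattern expected when cleavage separates the two fluorescent proteins.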

An assay for ubiquitin hydrolase activity measures the hydrolysis of a ubiquitin precursor. 
The assay is performed at ambient temperature and contains an aliquot of PMMM and the appropriate 
substrate in a suitable buffer. Chemically synthesized human ubiquitin-valine may be used as 
substrate. Cleavage of the C-terminal valine residue from the substrate is monitored by capillary
electrophoresis (Franklin, K. et al. (1997) Anal. Biochem. 247:305-309).

PMMM protease inhibitor activity for alpha 2-HS-glycoprotein (AHSG) can be measured as a 
decrease in osteogenic activity in dexamethasone-treated rat bone marrow cell cultures (dex-RBMC). 
Assays are carried out in 96-well culture plates containing minimal essential medium supplemented 
with 15% fetal bovine serum, ascorbic acid (50 mg/ml), antibiotics (100 mg/ml penicillin G, 50
mg/ml gentamicin, 0.3 mg/ml fungizone), 10 mM β-glycerophosphate, dexamethasone (10⁻⁸ M) and
various concentrations of PMMM for 12-14 days. Mineralized tissue formation in the cultures is 
quantified by measuring the absorbance at 525 nm using a 96-well plate reader (Binkert, C. et al. 
(1999) J. Biol. Chem. 274:28514-28520). 

PMMM protease inhibitor activity for inter-alpha-trypsin inhibitor (ITI) can be measured by a
continuous spectrophotometric rate determination of trypsin activity. The assay is performed at
ambient temperature in a quartz cuvette in pH 7.6 assay buffer containing 63 mM sodium phosphate,
0.23 mM Nα-benzoyl-L-arginine ethyl ester, 0.06 mM hydrochloric acid, 100 units trypsin, and
various concentrations of PMMM. Immediately after mixing by inversion, the increase in absorbance is
recorded for approximately 5 minutes and the enzyme activity is calculated (Bergmeyer, H.U. et al.
(1974) Meth. Enzym. Anal. 1:515-516).

PMMM isomerase activity such as peptidyl prolyl cis/trans isomerase activity can be assayed
by an enzyme assay described by Rahfeld, J.U., et al. (1994; FEBS Lett. 352:180-184). The assay is
performed at 10°C in 35 mM HEPES buffer, pH 7.8, containing chymotrypsin (0.5 mg/ml) and
PMMM at a variety of concentrations. Under these assay conditions, the substrate,
Suc-Ala-Xaa-Pro-Phe-4-NA, is in equilibrium with respect to the prolyl bond, with 80-95% in trans and 5-20% in cis
conformation. An aliquot (2 ml) of the substrate dissolved in dimethyl sulfoxide (10 mg/ml) is added
to the reaction mixture described above. Only the cis isomer of the substrate is a substrate for
cleavage by chymotrypsin. Thus, as the substrate is isomerized by PMMM, the product is cleaved by
chymotrypsin to produce 4-nitroanilide, which is detected by its absorbance at 390 nm.
4-Nitroanilide appears in a time-dependent and a PMMM concentration-dependent manner.

PMMM galactosyltransferase activity can be determined by measuring the transfer of
radiolabeled galactose from UDP-galactose to a GlcNAc-terminated oligosaccharide chain
(Kolbinger, F. et al. (1998) J. Biol. Chem. 273:58-65). The sample is incubated with 14 µl of assay
stock solution (180 mM sodium cacodylate, pH 6.5, 1 mg/ml bovine serum albumin, 0.26 mM
UDP-galactose, 2 µl of UDP-[3H]galactose), 1 µl of MnCl2 (500 mM), and 2.5 µl of
GlcNAcβO-(CH2)8-CO2Me (37 mg/ml in dimethyl sulfoxide) for 60 minutes at 37 °C. The reaction is quenched by the
addition of 1 ml of water and loaded on a C18 Sep-Pak cartridge (Waters), and the column is washed
twice with 5 ml of water to remove unreacted UDP-[3H]galactose. The [3H]galactosylated
GlcNAcβO-(CH2)8-CO2Me remains bound to the column during the water washes and is eluted with 5
ml of methanol. Radioactivity in the eluted material is measured by liquid scintillation counting and
is proportional to galactosyltransferase activity in the starting sample.

PMMM induction by heat or toxins may be demonstrated using primary cultures of human 
fibroblasts or human cell lines such as CCL-13, HEK293, or HEP G2 (ATCC). To heat induce
PMMM expression, aliquots of cells are incubated at 42°C for 15, 30, or 60 minutes. Control aliquots
are incubated at 37°C for the same time periods. To induce PMMM expression by toxins, aliquots of
cells are treated with 100 µM arsenite or 20 mM azetidine-2-carboxylic acid for 0, 3, 6, or 12 hours.
After exposure to heat, arsenite, or the amino acid analogue, samples of the treated cells are harvested
and cell lysates prepared for analysis by western blot. Cells are lysed in lysis buffer containing 1%
Nonidet P-40, 0.15 M NaCl, 50 mM Tris-HCl, 5 mM EDTA, 2 mM N-ethylmaleimide, 2 mM
phenylmethylsulfonyl fluoride, 1 mg/ml leupeptin, and 1 mg/ml pepstatin. Twenty micrograms of the
cell lysate is separated on an 8% SDS-PAGE gel and transferred to a membrane. After blocking with
5% nonfat dry milk/phosphate-buffered saline for 1 h, the membrane is incubated overnight at 4°C or
at room temperature for 2-4 hours with an appropriate dilution of anti-PMMM serum in 2% nonfat
dry milk/phosphate-buffered saline. The membrane is then washed and incubated with a 1:1000
dilution of horseradish peroxidase-conjugated goat anti-rabbit IgG in 2% dry milk/phosphate-buffered
saline. After washing with 0.1% Tween 20 in phosphate-buffered saline, the PMMM protein is
detected and compared to controls using chemiluminescence.

PMMM lysyl hydroxylase activity is determined by measuring the production of
hydroxy[14C]lysine from [14C]lysine. Radiolabeled protocollagen is incubated with PMMM in buffer
containing ascorbic acid, iron sulfate, dithiothreitol, bovine serum albumin, and catalase. Following a
30 minute incubation, the reaction is stopped by the addition of acetone, and centrifuged. The
sedimented material is dried, and the hydroxy[14C]lysine is converted to [14C]formaldehyde by
oxidation with periodate, and then extracted into toluene. The amount of 14C extracted into toluene is
quantified by scintillation counting, and is proportional to the activity of PMMM in the sample
(Kivirikko, K., and R. Myllyla (1982) Methods Enzymol. 82:245-304).
XIX. Identification of PMMM Substrates

Phage display libraries can be used to identify optimal substrate sequences for PMMM. A
random hexamer followed by a linker and a known antibody epitope is cloned as an N-terminal
extension of gene III in a filamentous phage library. Gene III codes for a coat protein, and the epitope
will be displayed on the surface of each phage particle. The library is incubated with PMMM under
proteolytic conditions so that the epitope will be removed if the hexamer codes for a PMMM
cleavage site. An antibody that recognizes the epitope is added along with immobilized protein A.
Uncleaved phage, which still bear the epitope, are removed by centrifugation. Phage in the
supernatant are then amplified and undergo several more rounds of screening. Individual phage
clones are then isolated and sequenced. Reaction kinetics for these peptide substrates can be studied
using an assay in Example XVIII, and an optimal cleavage sequence can be derived (Ke, S.H. et al.
(1997) J. Biol. Chem. 272:16603-16609).

To screen for in vivo PMMM substrates, this method can be expanded to screen a cDNA
expression library displayed on the surface of phage particles (T7SELECT10-3 Phage display vector,
Novagen, Madison, WI) or yeast cells (pYD1 yeast display vector kit, Invitrogen, Carlsbad, CA). In
this case, entire cDNAs are fused between Gene III and the appropriate epitope.
XX. Identification of PMMM Inhibitors 

Compounds to be tested are arrayed in the wells of a multi-well plate in varying
concentrations along with an appropriate buffer and substrate, as described in the assays in Example
XVIII. PMMM activity is measured for each well and the ability of each compound to inhibit
PMMM activity can be determined, as well as the dose-response kinetics. This assay could also be 
used to identify molecules which enhance PMMM activity. 
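
The dose-response analysis described above can be sketched as an estimate of the concentration giving 50% inhibition (IC50). The example below is a minimal illustration that interpolates linearly on a log-concentration scale between the two tested concentrations bracketing 50% inhibition; a fitted sigmoidal model would normally be preferred when enough data points are available. All function names and values are hypothetical.

```python
import math


def percent_inhibition(activity, control_activity):
    """Inhibition relative to an uninhibited control well, in percent."""
    return 100.0 * (1.0 - activity / control_activity)


def ic50_by_interpolation(concentrations, activities, control_activity):
    """Interpolate the IC50 on a log10 concentration scale.

    `concentrations` must be positive and sorted in ascending order.
    Returns None when 50% inhibition is not bracketed by the data.
    """
    points = [(math.log10(c), percent_inhibition(a, control_activity))
              for c, a in zip(concentrations, activities)]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= 50.0 <= y1 and y1 != y0:
            x = x0 + (50.0 - y0) * (x1 - x0) / (y1 - y0)
            return 10.0 ** x
    return None


# Hypothetical plate data: compound concentration (micromolar) and
# measured activity, with an uninhibited control activity of 100.
concs = [0.1, 1.0, 10.0, 100.0]
acts = [95.0, 75.0, 25.0, 5.0]
ic50 = ic50_by_interpolation(concs, acts, control_activity=100.0)
```

A compound that enhances activity would show negative inhibition values in the same calculation, which is one way such a plate could also flag activators.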

In the alternative, phage display libraries can be used to screen for peptide PMMM inhibitors.
Candidates are found among peptides which bind tightly to a protease. In this case, multi-well plate
wells are coated with PMMM and incubated with a random peptide phage display library or a cyclic
peptide library (Koivunen, E. et al. (1999) Nature Biotech 17:768-774). Unbound phage are washed
away and selected phage amplified and rescreened for several more rounds. Candidates are tested for
PMMM inhibitory activity using an assay described in Example XVIII.

Various modifications and variations of the described compositions, methods, and systems of 
the invention will be apparent to those skilled in the art without departing from the scope and spirit of 
the invention. It will be appreciated that the invention provides novel and useful proteins, and their 
encoding polynucleotides, which can be used in the drug discovery process, as well as methods for
using these compositions for the detection, diagnosis, and treatment of diseases and conditions.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Nor should the description of such embodiments be considered exhaustive or limit the invention to
the precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined
with elements from one or more other embodiments. Such combinations can form a number of
embodiments within the scope of the invention. It is intended that the scope of the invention be 
defined by the following claims and their equivalents.

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO:32-62. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a 
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method of producing a polypeptide of claim 1, the method comprising: 

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein 
said cell is transformed with a recombinant polynucleotide, and said recombinant 
polynucleotide comprises a promoter sequence operably linked to a polynucleotide 
encoding the polypeptide of claim 1, and 

b) recovering the polypeptide so expressed. 

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of SEQ ID NO:1-31.

11. An isolated antibody which specifically binds to a polypeptide of claim 1. 

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:32-62, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
90% identical to a polynucleotide sequence selected from the group consisting of SEQ 
ID NO:32-41, SEQ ID NO:43-56, and SEQ ID NO:61-62, 

c) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
92% identical to the polynucleotide sequence of SEQ ID NO:42, 

d) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
97% identical to the polynucleotide sequence of SEQ ID NO:59,

e) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
98% identical to a polynucleotide sequence selected from the group consisting of SEQ 
ID NO:58 and SEQ ID NO:60, 

f) a polynucleotide complementary to a polynucleotide of a), 

g) a polynucleotide complementary to a polynucleotide of b), 

h) a polynucleotide complementary to a polynucleotide of c), 

i) a polynucleotide complementary to a polynucleotide of d), 
j) a polynucleotide complementary to a polynucleotide of e), and 
k) an RNA equivalent of a)-j). 

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 12. 

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and
which probe specifically hybridizes to said target polynucleotide, under conditions
whereby a hybridization complex is formed between said probe and said target
polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if 
present, the amount thereof. 



15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides. 

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) amplifying said target polynucleotide or fragment thereof using polymerase chain 
reaction amplification, and 

b) detecting the presence or absence of said amplified target polynucleotide or fragment 
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-31.

19. A method for treating a disease or condition associated with decreased expression of 
functional PMMM, comprising administering to a patient in need of such treatment the composition of 
claim 17. 

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

21. A composition comprising an agonist compound identified by a method of claim 20 and a 
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of 
functional PMMM, comprising administering to a patient in need of such treatment a composition of 
claim 21. 

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting antagonist activity in the sample. 

24. A composition comprising an antagonist compound identified by a method of claim 23 and 
a pharmaceutically acceptable excipient. 

25. A method for treating a disease or condition associated with overexpression of functional 
PMMM, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim
1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby 
identifying a compound that specifically binds to the polypeptide of claim 1. 

27. A method of screening for a compound that modulates the activity of the polypeptide of 
claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under conditions 
permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test 
compound, and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test 
compound with the activity of the polypeptide of claim 1 in the absence of the test 
compound, wherein a change in the activity of the polypeptide of claim 1 in the 
presence of the test compound is indicative of a compound that modulates the activity 
of the polypeptide of claim 1. 

28. A method of screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method 
comprising: 

a) exposing a sample comprising the target polynucleotide to a compound, under 
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying 
amounts of the compound and in the absence of the compound. 

29. A method of assessing toxicity of a test compound, the method comprising: 

a) treating a biological sample containing nucleic acids with the test compound, 

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising 
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions 
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, 

c) quantifying the amount of hybridization complex, and 

d) comparing the amount of hybridization complex in the treated biological sample with 
the amount of hybridization complex in an untreated biological sample, wherein a 
difference in the amount of hybridization complex in the treated biological sample is 
indicative of toxicity of the test compound. 



30. A method for a diagnostic test for a condition or disease associated with the expression of 
PMMM in a biological sample, the method comprising:

a) combining the biological sample with an antibody of claim 11, under conditions suitable
for the antibody to bind the polypeptide and form an antibody:polypeptide complex,
and

b) detecting the complex, wherein the presence of the complex correlates with the
presence of the polypeptide in the biological sample.



31. The antibody of claim 11, wherein the antibody is:

a) a chimeric antibody,

b) a single chain antibody,

c) a Fab fragment,

d) a F(ab')2 fragment, or

e) a humanized antibody. 



32. A composition comprising an antibody of claim 11 and an acceptable excipient. 

33. A method of diagnosing a condition or disease associated with the expression of PMMM
in a subject, comprising administering to said subject an effective amount of the composition of claim
32.

34. A composition of claim 32, wherein the antibody is labeled.

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab expression
library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant
immunoglobulin library.

44. A method of detecting a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-31 in a sample, the method comprising:

a) incubating the antibody of claim 11 with the sample under conditions to allow specific
binding of the antibody and the polypeptide, and 

b) detecting specific binding, wherein specific binding indicates the presence of a 
polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-31 in the sample.

45. A method of purifying a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO:1-31 from a sample, the method comprising:

a) incubating the antibody of claim 11 with the sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-31.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim
13.

47. A method of generating an expression profile of a sample which contains polynucleotides, 
the method comprising: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides 
of the sample under conditions suitable for the formation of a hybridization complex, 
and

c) quantifying the expression of the polynucleotides in the sample.

48. An array comprising different nucleotide molecules affixed in distinct physical locations
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide
or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target
polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12. 

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide. 

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to said target polynucleotide.

52. An array of claim 48, which is a microarray.

53. An array of claim 48, further comprising said target polynucleotide hybridized to a
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 

55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at
another distinct physical location on the substrate.

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20.

76. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:21. 

77. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:22. 

78. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:23. 

79. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:24. 

80. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:25. 

81. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:26. 

82. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:27.

83. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:28.

84. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:29.

85. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:30.

86. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:31.

87. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:32.

88. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:33. 

89. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:34. 

90. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:35. 

91. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:36. 

92. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:37.

93. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:38.

94. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:39.

95. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:40. 

96. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:41. 

10 

97. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:42. 

98. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:43. 
15 99. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:44. 

100. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:45.

101. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:46.

102. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:47.

103. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:48.

104. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:49.

105. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:50.

106. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:51.

107. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:52.

108. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:53.

109. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:54.

110. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:55.

111. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:56.

112. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:57.

113. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:58.

114. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:59.

115. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:60.

116. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:61.
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NO:62. 
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<110> INCYTE GENOMICS, INC. 
SPRAGUE, William W.
CHAWLA, Narinder K. 
WARREN, Bridget A. 
TANG, Y. Tom 
ELLIOTT, Vicki S. 
MARQUIS, Joseph P. 
LI, Joana X. 
GRIFFIN, Jennifer A. 
GIETZEN, Kimberly J. 
YANG, Junming 
LU, Dyung Aina M. 
EMERLING, Brooke M. 
DUGGAN, Brendan M. 
RICHARDSON, Thomas W. 
LEE, Soo Yeun 
RAMKUMAR, Jayalaxmi
BECHA, Shanya D. 
LEHR-MASON, Patricia M. 
SWARNAKAR, Anita 
TRAN, Uyen K. 
KABLE, Amy E. 
HAFALIA, April J. A. 
KHARE, Reena 

<120> PROTEIN MODIFICATION AND MAINTENANCE MOLECULES 

<130> PF-1186 PCT 

<140> To Be Assigned 
<141> Herewith 

<150> US 60/322,196 
<151> 2001-09-14 

<150> US 60/324,134 
<151> 2001-09-21 

<150> US 60/327,233 
<151> 2001-10-05 

<150> US 60/346,198 
<151> 2001-10-26 

<150> US 60/343,980 
<151> 2001-11-02 

<150> US 60/348,887 
<151> 2001-11-09 

<150> US 60/332,423 
<151> 2001-11-16 

<150> US 60/334,145 
<151> 2001-11-28 
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<150> US 60/334,229 
<151> 2001-11-28 

<150> US 60/337,451 
<151> 2001-12-06 



<150> US 60/351,928 
<151> 2002-01-25 



<150> US 60/366,837 
<151> 2002-03-21 

<160> 62 

<170> PERL Program 

<210> 1 

<211> 404 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 8268274CD1 



<400> 1 

Met Glu Thr His Ile Ser Cys Leu Phe Pro Glu Leu Leu Ala Met
1 5 10 15

Ile Phe Gly Tyr Leu Asp Val Arg Asp Lys Gly Arg Ala Ala Gln
20 25 30

Val Cys Thr Ala Trp Arg Asp Ala Ala Tyr His Lys Ser Val Trp
35 40 45

Arg Gly Val Glu Ala Lys Leu His Leu Arg Arg Ala Asn Pro Ser
50 55 60

Leu Phe Pro Ser Leu Gln Ala Arg Gly Ile Arg Arg Val Gln Ile
65 70 75

Leu Ser Leu Arg Arg Ser Leu Ser Tyr Val Ile Gln Gly Met Ala
80 85 90

Asn Ile Glu Ser Leu Asn Leu Ser Gly Cys Tyr Asn Leu Thr Asp
95 100 105

Asn Gly Leu Gly His Ala Phe Val Gln Glu Ile Gly Ser Leu Arg
110 115 120

Ala Leu Asn Leu Ser Leu Cys Lys Gln Ile Thr Asp Ser Ser Leu
125 130 135

Gly Arg Ile Ala Gln Tyr Leu Lys Gly Leu Glu Val Leu Glu Leu
140 145 150

Gly Gly Cys Ser Asn Ile Thr Asn Thr Gly Leu Leu Leu Ile Ala
155 160 165

Trp Gly Leu Gln Arg Leu Lys Ser Leu Asn Leu Arg Ser Cys Arg
170 175 180

His Leu Ser Asp Val Gly Ile Gly His Leu Ala Gly Met Thr Arg
185 190 195

Ser Ala Ala Glu Gly Cys Leu Gly Leu Glu Gln Leu Thr Leu Gln
200 205 210

Asp Cys Gln Lys Leu Thr Asp Leu Ser Leu Lys His Ile Ser Arg
215 220 225

Gly Leu Thr Gly Leu Arg Leu Leu Asn Leu Ser Phe Cys Gly Gly
230 235 240

Ile Ser Asp Ala Gly Leu Leu His Leu Ser His Met Gly Ser Leu
245 250 255

Arg Ser Leu Asn Leu Arg Ser Cys Asp Asn Ile Ser Asp Thr Gly
260 265 270

Ile Met His Leu Ala Met Gly Ser Leu Arg Leu Ser Gly Leu Asp
275 280 285

Val Ser Phe Cys Asp Lys Val Gly Asp Gln Ser Leu Ala Tyr Ile
290 295 300

Ala Gln Gly Leu Asp Gly Leu Lys Ser Leu Ser Leu Cys Ser Cys
305 310 315

His Ile Ser Asp Asp Gly Ile Asn Arg Met Val Arg Gln Met His
320 325 330

Gly Leu Arg Thr Leu Asn Ile Gly Gln Cys Val Arg Ile Thr Asp
335 340 345

Lys Gly Leu Glu Leu Ile Ala Glu His Leu Ser Gln Leu Thr Gly
350 355 360

Ile Asp Leu Tyr Gly Cys Thr Arg Ile Thr Lys Arg Gly Leu Glu
365 370 375

Arg Ile Thr Gln Leu Pro Cys Leu Lys Glu Ala Arg Gly Asp Phe
380 385 390

Ser Pro Leu Phe Thr Val Arg Thr Arg Gly Ser Ser Arg Arg
395 400

<210> 2 
<211> 900 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7500515CD1 

<400> 2 

Met Lys Pro Pro Arg Pro Val Arg Thr Cys Ser Lys Val Leu Val
1 5 10 15

Leu Leu Ser Leu Leu Ala Ile His Gln Thr Thr Thr Ala Glu Lys
20 25 30

Asn Gly Ile Asp Ile Tyr Ser Leu Thr Val Asp Ser Arg Val Ser
35 40 45

Ser Arg Phe Ala His Thr Val Val Thr Ser Arg Val Val Asn Arg
50 55 60

Ala Asn Thr Val Gln Glu Ala Thr Phe Gln Met Glu Leu Pro Lys
65 70 75

Lys Ala Phe Ile Thr Asn Phe Ser Met Ile Ile Asp Gly Met Thr
80 85 90

Tyr Pro Gly Ile Ile Lys Glu Lys Ala Glu Ala Gln Ala Gln Tyr
95 100 105

Ser Ala Ala Val Ala Lys Gly Lys Ser Ala Gly Leu Val Lys Ala
110 115 120

Thr Gly Arg Asn Met Glu Gln Phe Gln Val Ser Val Ser Val Ala
125 130 135

Pro Asn Ala Lys Ile Thr Phe Glu Leu Val Tyr Glu Glu Leu Leu
140 145 150

Lys Arg Arg Leu Gly Val Tyr Glu Leu Leu Leu Lys Val Arg Pro
155 160 165

Gln Gln Leu Val Lys His Leu Gln Met Asp Ile His Ile Phe Glu
170 175 180

Pro Gln Gly Ile Ser Phe Leu Glu Thr Glu Ser Thr Phe Met Thr
185 190 195

Asn Gln Leu Val Asp Ala Leu Thr Thr Trp Gln Asn Lys Thr Lys
200 205 210

Ala His Ile Arg Phe Lys Pro Thr Leu Ser Gln Gln Gln Lys Ser
215 220 225

Pro Glu Gln Gln Glu Thr Val Leu Asp Gly Asn Leu Ile Ile Arg
230 235 240

Tyr Asp Val Asp Arg Ala Ile Ser Gly Gly Ser Ile Gln Ile Glu
245 250 255

Asn Gly Tyr Phe Val His Tyr Phe Ala Pro Glu Gly Leu Thr Thr
260 265 270

Met Pro Lys Asn Val Val Phe Val Ile Asp Lys Ser Gly Ser Met
275 280 285

Ser Gly Arg Lys Ile Gln Gln Thr Arg Glu Ala Leu Ile Lys Ile
290 295 300

Leu Asp Asp Leu Ser Pro Arg Asp Gln Phe Asn Leu Ile Val Phe
305 310 315

Ser Thr Glu Ala Thr Gln Trp Arg Pro Ser Leu Val Pro Ala Ser
320 325 330

Ala Glu Asn Val Asn Lys Ala Arg Ser Phe Ala Ala Gly Ile Gln
335 340 345

Ala Leu Gly Gly Thr Asn Ile Asn Asp Ala Met Leu Met Ala Val
350 355 360

Gln Leu Leu Asp Ser Ser Asn Gln Glu Glu Arg Leu Pro Glu Gly
365 370 375

Ser Val Ser Leu Ile Ile Leu Leu Thr Asp Gly Asp Pro Thr Val
380 385 390

Gly Glu Thr Asn Pro Arg Ser Ile Gln Asn Asn Val Arg Glu Ala
395 400 405

Val Ser Gly Arg Tyr Ser Leu Phe Cys Leu Gly Phe Gly Phe Asp
410 415 420

Val Ser Tyr Ala Phe Leu Glu Lys Leu Ala Leu Asp Asn Gly Gly
425 430 435

Leu Ala Arg Arg Ile His Glu Asp Ser Asp Ser Ala Leu Gln Leu
440 445 450

Gln Asp Phe Tyr Gln Glu Val Ala Asn Pro Leu Leu Thr Ala Val
455 460 465

Thr Phe Glu Tyr Pro Ser Asn Ala Val Glu Glu Val Thr Gln Asn
470 475 480

Asn Phe Arg Leu Leu Phe Lys Gly Ser Glu Met Val Val Ala Gly
485 490 495

Lys Leu Gln Asp Arg Gly Pro Asp Val Leu Thr Ala Thr Val Ser
500 505 510

Gly Lys Leu Pro Thr Gln Asn Ile Thr Phe Gln Thr Glu Ser Ser
515 520 525

Val Ala Glu Gln Glu Ala Glu Phe Gln Ser Pro Lys Tyr Ile Phe
530 535 540

His Asn Phe Met Glu Arg Leu Trp Ala Tyr Leu Thr Ile Gln Gln
545 550 555

Leu Leu Glu Gln Thr Val Ser Ala Ser Asp Ala Asp Gln Gln Ala
560 565 570

Leu Arg Asn Gln Ala Leu Asn Leu Ser Leu Ala Tyr Ser Phe Val
575 580 585

Thr Pro Leu Thr Ser Met Val Val Thr Lys Pro Asp Asp Gln Glu
590 595 600

Gln Ser Gln Val Ala Glu Lys Pro Met Glu Gly Glu Ser Arg Asn
605 610 615

Arg Asn Val His Ser Ala Gly Ala Ala Gly Ser Arg Met Asn Phe
620 625 630

Arg Pro Gly Val Leu Ser Ser Arg Gln Leu Gly Leu Pro Gly Pro
635 640 645

Pro Asp Val Pro Asp His Ala Ala Tyr His Pro Phe Arg Arg Leu
650 655 660

Ala Ile Leu Pro Ala Ser Ala Pro Pro Ala Thr Ser Asn Pro Asp
665 670 675

Pro Ala Val Ser Arg Val Met Asn Met Lys Ile Glu Glu Thr Thr
680 685 690

Met Thr Thr Gln Thr Pro Ala Pro Ile Gln Ala Pro Ser Ala Ile
695 700 705

Leu Pro Leu Pro Gly Gln Ser Val Glu Arg Leu Cys Val Asp Pro
710 715 720

Arg His Arg Gln Gly Pro Val Asn Leu Leu Ser Asp Pro Glu Gln
725 730 735

Gly Val Glu Val Thr Gly Gln Tyr Glu Arg Glu Lys Ala Gly Phe
740 745 750

Ser Trp Ile Glu Val Thr Phe Lys Asn Pro Leu Val Trp Val His
755 760 765

Ala Ser Pro Glu His Val Val Val Thr Arg Asn Arg Arg Ser Ser
770 775 780

Ala Tyr Lys Trp Lys Glu Thr Leu Phe Ser Val Met Pro Gly Leu
785 790 795

Lys Met Thr Met Asp Lys Thr Gly Leu Leu Leu Leu Ser Asp Pro
800 805 810

Asp Lys Val Thr Ile Gly Leu Leu Phe Trp Asp Gly Arg Gly Glu
815 820 825

Gly Leu Arg Leu Leu Leu Arg Asp Thr Asp Arg Phe Ser Ser His
830 835 840

Val Gly Gly Thr Leu Gly Gln Phe Tyr Gln Glu Val Leu Trp Gly
845 850 855

Ser Pro Ala Ala Ser Asp Asp Gly Arg Arg Thr Leu Arg Val Gln
860 865 870

Gly Asn Asp His Ser Ala Thr Arg Glu Arg Arg Leu Asp Tyr Gln
875 880 885

Glu Gly Pro Pro Gly Val Glu Ile Ser Cys Trp Ser Val Glu Leu
890 895 900

<210> 3
<211> 436
<212> PRT

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 2256826CD1

<400> 3

Met Arg Arg Asp Val Asn Gly Val Thr Lys Ser Arg Phe Glu Met
1 5 10 15

Phe Ser Asn Ser Asp Glu Ala Val lie Asn Lys Lys Leu Pro Lys 
20 25 30 

Glu Leu Leu Leu Arg lie Phe Ser Phe Leu Asp Val Val Thr Leu 
35 40 45 

Cys Arg Cys Ala Gin Val Ser Arg Ala Trp Asn Val Leu Ala Leu 
50 55 60 

Asp Gly Ser Asn Trp Gln Arg Ile Asp Leu Phe Asp Phe Gln Arg
65 70 75

Asp He Glu Gly Arg Val Val Glu Asn He Ser Lys Arg Cys Gly 
80 85 90 

Gly Phe Leu Arg Lys Leu Ser Leu Arg Gly Cys Leu Gly Val Gly 
95 100 105 

Asp Asn Ala Leu Arg Thr Phe Ala Gin Asn Cys Arg Asn He Glu 
110 115 120 

Val Leu Asn Leu Asn Gly Cys Thr Lys Thr Thr Asp Ala Thr Cys 
125 130 135 

Thr Ser Leu Ser Lys Phe Cys Ser Lys Leu Arg His Leu Asp Leu 
140 145 150 

Ala Ser Cys Thr Ser He Thr Asn Met Ser Leu Lys Ala Leu Ser 
155 160 165 

Glu Gly Cys Pro Leu Leu Glu Gin Leu Asn He Ser Trp Cys Asp 
170 175 180 

Gin Val Thr Lys Asp Gly He Gin Ala Leu Val Arg Gly Cys Gly 
185 190 195 

Gly Leu Lys Ala Leu Phe Leu Lys Gly Cys Thr Gin Leu Glu Asp 
200 205 210 

Glu Ala Leu Lys Tyr He Gly Ala His Cys Pro Glu Leu Val Thr 
215 220 225 

Leu Asn Leu Gin Thr Cys Leu Gin He Thr Asp Glu Gly Leu He 
230 235 240 

Thr He Cys Arg Gly Cys His Lys Leu Gin Ser Leu Cys Ala Ser 
245 250 255 

Gly Cys Ser Asn He Thr Asp Ala He Leu Asn Ala Leu Gly Gin 
260 265 270 

Asn Cys Pro Arg Leu Arg He Leu Glu Val Ala Arg Cys Ser Gin 
275 280 285 

Leu Thr Asp Val Gly Phe Thr Thr Leu Ala Arg Asn Cys His Glu 
290 295 300 

Leu Glu Lys Met Asp Leu Glu Glu Cys Val Gin He Thr Asp Ser 
305 310 315 

Thr Leu He Gin Leu Ser He His Cys Pro Arg Leu Gin Val Leu 
320 325 330 

Ser Leu Ser His Cys Glu Leu He Thr Asp Asp Gly He Arg His 
335 340 345 

Leu Gly Asn Gly Ala Cys Ala His Asp Gin Leu Glu Val lie Glu 
350 355 360 

Leu Asp Asn Cys Pro Leu He Thr Asp Ala Ser Leu Glu His Leu 
365 370 375 

Lys Ser Cys His Ser Leu Glu Arg He Glu Leu Tyr Asp Cys Gin 
380 385 390 

Gin He Thr Arg Ala Gly He Lys Arg Leu Arg Thr His Leu Pro 
395 400 405 

Asn He Lys Val His Ala Tyr Phe Ala Pro Val Thr Pro Pro Pro 
410 415 420 

Ser Val Gly Gly Ser Arg Gln Arg Phe Cys Arg Cys Cys Ile Ile
425 430 435

Leu



<210> 4 

<211> 356 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7686186CD1 

<400> 4 

Met Glu Asn Asp Val Thr Tyr Pro Asp Pro Tyr Ser Arg Pro Ala 
1 5 10 15 

Pro Asp Arg Phe lie Arg Arg Trp Leu Val lie Thr Gly Cys lie 
20 25 30 

Ala Ala Leu Met Leu Leu Trp Gin Phe Leu Pro Ala lie Glu Ala 
35 40 45 

Trp Phe Ser Pro His Glu Thr Gin Glu Arg Thr Val Thr Pro Arg 
50 55 60 

Gly Asp Leu Ala Ala Asp Glu Lys Thr Thr lie Glu Leu Phe Glu 
65 70 75 

Lys Ser Arg Gly Ser Val Val Tyr He Thr Thr Ala Gin Leu Val 
80 85 90 

Arg Asp Val Trp Ser Arg Asn Val Phe Ser Val Pro Arg Gly Thr 
95 100 105 

Gly Ser Gly Phe He Trp Asp Asp Ala Gly His Val Val Thr Asn 

110 115 120 

Phe His Val He Gin Gly Ala Ser Ser Ala Thr Val Lys Leu Ala 

125 130 135 

Asp Gly Arg Asp Tyr Gin Ala Ala Leu Val Gly Ala Ser Pro Ala 

140 145 150 

His Asp He Ala Val Leu Lys He Gly Val Gly Phe Lys Arg Pro 

155 160 165 

Pro Ala Val Pro Val Gly Thr Ser Ala Asp Leu Lys Val Gly Gin 

170 175 180 

Lys Val Phe Ala He Gly Asn Pro Phe Gly Leu Asp Trp Thr Leu 

185 190 195 

Thr Thr Gly He Val Ser Ala Leu Asp Arg Thr Leu Ser Gly Asp 

200 205 210 

Ala Ser Gly Pro Ala He Asp His Leu He Gin Thr Asp Ala Ala 

215 220 225 

He Asn Pro Gly Asn Ser Gly Gly Pro Leu Leu Asp Ser Ala Gly 

230 235 240 

Arg Leu He Gly He Asn Thr Ala He Tyr Ser Pro Ser Gly Ala 

245 250 255 

Ser Ala Gly He Gly Phe Ala Val Pro Val Asp Thr Val Met Arg 

260 265 270 

Val Val Pro Gin Leu He Lys Thr Gly Lys Tyr He Arg Pro Ala 

275 280 285 

Leu Gly lie Glu Val Asp Glu Gin Leu Asn Ala Arg Leu Gin Ala 

290 295 300 

Leu Thr Gly Ser Lys Gly Val Phe Val Leu Arg Val Thr Pro Gly

305 310 315

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Ser Ala Ala His Arg Ala Gly Leu Val Gly Val Glu Val Thr Ala 
320 325 330 

Gly Gly lie Val Pro Gly Asp Arg Val lie Ser lie Asp Gly lie 
335 340 345 

Ala Val Asp Pro Gly lie Pro Asp Arg Thr Cys 
350 355 



<210> 5 
<211> 432 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 72617436CD1 



<400> 5 

Met Gly Pro Ser Ser Leu Arg Lys Thr Ser Ser Gly Leu Pro Leu
1 5 10 15

lie Leu His Tyr Gly Val lie Leu Gly Ala Pro Leu Ala Ser Ser 
20 25 30 

Cys Ala Gly Ala Cys Gly Thr Ser Phe Pro Asp Gly Leu Thr Pro 
35 40 45 

Glu Gly Thr Gin Ala Ser Gly Asp Lys Asp lie Pro Ala lie Asn 
50 55 60 

Gin Gly Leu He Leu Glu Glu Thr Pro Glu Ser Ser Phe Leu He 
65 70 75 

Glu Gly Asp He He Arg Pro Ser Pro Phe Arg Leu Leu Ser Ala 
80 85 90 

Thr Ser Asn Lys Trp Pro Met Gly Gly Ser Gly Val Val Glu Val 
95 100 105 

Pro Phe Leu Leu Ser Ser Lys Tyr Asp Glu Pro Ser Arg Gin Val 

110 115 120 

He Leu Glu Ala Leu Ala Glu Phe Glu Arg Ser Thr Cys He Arg 

125 130 135 

Phe Val Thr Tyr Gin Asp Gin Arg Asp Phe He Ser He He Pro 

140 145 150 

Met Tyr Gly Cys Phe Ser Ser Val Gly Arg Ser Gly Gly Met Gin 

155 160 165 

Val Val Ser Leu Ala Pro Thr Cys Leu Gin Lys Gly Arg Gly He 

170 175 180 

Val Leu His Glu Leu Met His Val Leu Gly Phe Trp His Glu His 

185 190 195 

Thr Arg Ala Asp Arg Asp Arg Tyr He Arg Val Asn Trp Asn Glu 

200 205 210 

He Leu Pro Gly Phe Glu He Asn Phe He Lys Ser Arg Ser Ser 

215 220 225 

Asn Met Leu Thr Pro Tyr Asp Tyr Ser Ser Val Met His Tyr Gly 

230 235 240 

Arg Leu Ala Phe Ser Arg Arg Gly Leu Pro Thr He Thr Pro Leu 

245 250 255 

Trp Ala Pro Ser Val His He Gly Gin Arg Trp Asn Leu Ser Ala 

260 265 270 

Ser Asp He Thr Arg Val Leu Gin Leu Tyr Gly Cys Ser Pro Ser 

275 280 285 

Gly Pro Arg Pro Arg Gly Arg Gly Ser His Ala His Ser Thr Gly

290 295 300

Arg Ser Pro Ala Pro Ala Ser Leu Ser Leu Gln Arg Leu Leu Glu

305 310 315

Ala Leu Ser Ala Glu Ser Arg Ser Pro Asp Pro Ser Gly Ser Ser

320 325 330

Ala Gly Gly Gln Pro Val Pro Ala Gly Pro Gly Glu Ser Pro His

335 340 345

Gly Trp Glu Ser Pro Ala Leu Lys Lys Leu Ser Ala Glu Ala Ser

350 355 360

Ala Arg Gln Pro Gln Thr Leu Ala Ser Ser Pro Arg Ser Arg Pro

365 370 375

Gly Ala Gly Ala Pro Gly Val Ala Gln Glu Gln Ser Trp Leu Ala

380 385 390

Gly Val Ser Thr Lys Pro Thr Val Pro Ser Ser Glu Ala Gly Ile

395 400 405

Gln Pro Val Pro Val Gln Gly Ser Pro Ala Leu Pro Gly Gly Cys

410 415 420

Val Pro Arg Asn His Phe Lys Gly Met Ser Glu Asp

425 430

<210> 6 
<211> 248 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7501945CD1 



<400> 6 



Met Gly Pro Ser Ser Leu Arg Lys Thr Ser Ser Gly Leu Pro Leu
1 5 10 15

Ile Leu His Tyr Gly Val Ile Leu Gly Ala Pro Leu Ala Ser Ser
20 25 30

Cys Ala Gly Ala Cys Gly Thr Ser Phe Pro Asp Gly Leu Thr Pro
35 40 45

Glu Gly Thr Gln Ala Ser Gly Asp Lys Asp Ile Pro Ala Ile Asn
50 55 60

Gln Gly Leu Ile Leu Glu Glu Thr Pro Glu Ser Ser Phe Leu Ile
65 70 75

Glu Gly Asp Ile Ile Arg Pro Ser Pro Phe Arg Leu Leu Ser Ala
80 85 90

Thr Ser Asn Lys Trp Pro Met Gly Gly Ser Gly Val Val Glu Val
95 100 105

Pro Phe Leu Leu Ser Ser Lys Tyr Asp Glu Pro Ser Arg Gln Val
110 115 120

Ile Leu Glu Ala Leu Ala Glu Phe Glu Arg Ser Thr Cys Ile Arg
125 130 135

Phe Val Thr Tyr Gln Asp Gln Arg Asp Phe Ile Ser Ile Ile Pro
140 145 150

Met Tyr Gly Cys Phe Ser Ser Val Gly Arg Ser Gly Gly Met Gln
155 160 165

Val Val Ser Leu Ala Pro Thr Cys Leu Gln Lys Gly Arg Gly Ile
170 175 180

Val Leu His Glu Leu Met His Val Leu Gly Phe Trp His Glu His
185 190 195

Thr Arg Ala Asp Arg Asp Arg Tyr Ile His Val Asn Trp Asn Glu
200 205 210

Ile Leu Pro Gly Phe Glu Ile Asn Phe Ile Lys Ser Arg Ser Ser
215 220 225

Asn Met Leu Thr Pro Tyr Asp Tyr Ser Ser Val Met His Tyr Gly
230 235 240

Arg Val Pro Cys Pro Gln His Trp
245

<210> 7 
<211> 388 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7500264CD1 

<400> 7 

Met Ala Ser Val Ala Gln Glu Ser Ala Gly Ser Gln Arg Arg Leu

1 5 10 15

Pro Pro Arg His Gly Ala Leu Arg Gly Leu Leu Leu Leu Cys Leu 

20 25 30 

Trp Leu Pro Ser Gly Arg Ala Ala Leu Pro Pro Ala Ala Pro Leu 

35 40 45 

Ser Glu Leu His Ala Gin Leu Ser Gly Val Glu Gin Leu Leu Glu 

50 55 60 

Glu Phe Arg Arg Gin Leu Gin Gin Glu Arg Pro Gin Glu Glu Leu 

65 70 75 

Glu Leu Glu Leu Arg Ala Gly Gly Gly Pro Gin Glu Asp Cys Pro 

80 85 90 

Gly Pro Gly Ser Gly Gly Tyr Ser Ala Met Pro Asp Ala lie He 

95 100 105 
Arg Thr Lys Asp Ser Leu Ala Ala Gly Ala Ser Phe Leu Arg Ala 

110 115 120 

Pro Ala Ala Val Arg Gly Trp Arg Gin Cys Val Ala Ala Cys Cys 

125 130 135 

Ser Glu Pro Arg Cys Ser Val Ala Val Val Glu Leu Pro Arg Arg 

140 145 150 

Pro Ala Pro Pro Ala Ala Val Leu Gly Cys Tyr Leu Phe Asn Cys 

155 160 165 

Thr Ala Arg Gly Arg Asn Val Cys Lys Phe Ala Leu His Ser Gly 

170 175 180 

Tyr Ser Ser Tyr Ser Leu Ser Arg Ala Pro Asp Gly Ala Ala Leu 

185 190 195 

Ala Thr Ala Arg Ala Ser Pro Arg Gin Glu Lys Asp Ala Pro Pro 

200 205 210 

Leu Ser Lys Ala Gly Gln Asp Val Val Leu His Leu Pro Thr Asp

215 220 225 

Gly Val Val Leu Asp Gly Arg Glu Ser Thr Asp Asp His Ala He 

230 235 240 

Val Gin Tyr Glu Trp Ala Leu Leu Gin Gly Asp Pro Ser Val Asp 

245 250 255 

Met Lys Val Pro Gin Ser Gly Gly Asp Ser Leu Val Glu Lys Ser 

260 265 270 

Gln Lys Ala Thr Ala Pro Asn Lys Pro Pro Ala Leu Ser Asn Thr

275 280 285

Glu Lys Arg Asn His Ser Ala Phe Trp Gly Pro Glu Ser Gln Ile

290 295 300

Ile Pro Val Met Pro Asp Ser Ser Ser Ser Gly Lys Asn Arg Lys

305 310 315

Glu Glu Ser Tyr Ile Phe Glu Ser Lys Gly Asp Gly Gly Gly Gly

320 325 330

Glu His Pro Ala Pro Glu Thr Gly Ala Val Leu Pro Leu Ala Leu

335 340 345

Gly Leu Ala Ile Thr Ala Leu Leu Leu Leu Met Val Ala Cys Arg

350 355 360

Leu Arg Leu Val Lys Gln Lys Leu Lys Lys Ala Arg Pro Ile Thr

365 370 375

Ser Glu Glu Ser Asp Tyr Leu Ile Asn Gly Met Tyr Leu

380 385

<210> 8
<211> 467
<212> PRT

<213> Homo sapiens
<220>

<221> misc_feature

<223> Incyte ID No: 7499935CD1

<400> 8

Met Ala Ala Ala Thr Gly Pro Ser Phe Trp Leu Gly Asn Glu Thr
1 5 10 15

Leu Lys Val Pro Leu Ala Leu Phe Ala Leu Asn Arg Gln Arg Leu
20 25 30

Cys Glu Arg Leu Arg Lys Asn Pro Ala Val Gln Ala Gly Ser Ile
35 40 45

Val Val Leu Gln Gly Gly Glu Glu Thr Gln Arg Tyr Cys Thr Asp
50 55 60

Thr Gly Val Leu Phe Arg Gln Glu Ser Phe Phe His Trp Ala Phe
65 70 75

Gly Val Thr Glu Pro Gly Cys Tyr Gly Val Ile Asp Val Asp Thr
80 85 90

Gly Lys Ser Thr Leu Phe Val Pro Arg Leu Pro Ala Ser His Ala
95 100 105

Thr Trp Met Gly Lys Ile His Ser Lys Glu His Phe Lys Glu Lys
110 115 120

Tyr Ala Val Asp Asp Val Gln Tyr Val Asp Glu Ile Ala Ser Val
125 130 135

Leu Thr Ser Gln Lys Pro Ser Val Leu Leu Thr Leu Arg Gly Val
140 145 150

Asn Thr Asp Ser Gly Ser Val Cys Arg Glu Ala Ser Phe Asp Gly
155 160 165

Ile Ser Lys Phe Glu Val Asn Asn Thr Ile Leu His Pro Glu Ile
170 175 180

Val Glu Cys Arg Val Phe Lys Thr Asp Met Glu Leu Glu Val Leu
185 190 195

Arg Tyr Thr Asn Lys Ile Ser Ser Glu Ala His Arg Glu Val Met
200 205 210

Lys Ala Val Lys Val Gly Met Lys Glu Tyr Glu Leu Glu Ser Leu
215 220 225

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Phe Glu His Tyr Cys Tyr Ser Arg Gly Gly Met Arg His Ser Ser 

230 235 240 

Tyr Thr Cys He Cys Gly Ser Gly Glu Asn Ser Ala Val Leu His 

245 250 255 

Tyr Gly His Ala Gly Ala Pro Asn Asp Arg Thr He Gin Asn Gly 

260 265 270 

Asp Met Cys Leu Phe Asp Met Gly Gly Glu Tyr Tyr Cys Phe Ala 

275 280 285 

Ser Asp He Thr Cys Ser Phe Pro Ala Asn Gly Lys Phe Thr Ala 

290 295 300 

Asp Gin Lys Ala Val Tyr Glu Ala Val Leu Arg Ser Ser Arg Ala 

305 310 315 

Val Met Gly Ala Met Lys Pro Gly Val Trp Trp Pro Asp Met His

320 325 330

Arg Leu Ala Asp Arg He His Leu Glu Glu Leu Ala His Met Gly 

335 340 345 

He Leu Ser Gly Ser Val Asp Ala Met Val Gin Ala His Leu Gly 

350 355 360 

Ala Val Phe Met Pro His Gly Leu Gly His Phe Leu Gly He Asp 

365 370 375 

Val His Asp Val Gly Gly Tyr Pro Glu Gly Val Glu Arg He Tyr 

380 385 390 

Phe He Asp His Leu Leu Asp Glu Ala Leu Ala Asp Pro Ala Arg 

395 400 405 

Ala Ser Phe Leu Asn Arg Glu Val Leu Gln Arg Phe Arg Gly Phe

410 415 420

Gly Gly Val Arg Ile Glu Glu Asp Val Val Val Thr Asp Ser Gly

425 430 435

He Glu Leu Leu Thr Cys Val Pro Arg Thr Val Glu Glu He Glu 

440 445 450 

Ala Cys Met Ala Gly Cys Asp Lys Ala Phe Thr Pro Phe Ser Gly 
455 460 465 

Pro Lys 



<210> 9 

<211> 379 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7982285CD1 

<400> 9 

Met Glu Gly Asn Arg Asp Glu Ala Glu Lys Cys Val Glu He Ala 
1 5 10 15 

Arg Glu Ala Leu Asn Ala Gly Asn Arg Glu Lys Ala Gln Arg Phe
20 25 30

Leu Gln Lys Ala Glu Lys Leu Tyr Pro Leu Pro Ser Ala Arg Ala
35 40 45

Leu Leu Glu Ile Ile Met Lys Asn Gly Ser Thr Ala Gly Asn Ser
50 55 60

Pro His Cys Arg Lys Pro Ser Gly Ser Gly Asp Gln Ser Lys Pro
65 70 75

Asn Cys Thr Lys Asp Ser Thr Ser Gly Ser Gly Glu Gly Gly Lys
80 85 90

Gly Tyr Thr Lys Asp Gln Val Asp Gly Val Leu Ser Ile Asn Lys
95 100 105

Cys Lys Asn Tyr Tyr Glu Val Leu Gly Val Thr Lys Asp Ala Gly 

110 115 120 

Asp Glu Asp Leu Lys Lys Ala Tyr Arg Lys Leu Ala Leu Lys Phe 

125 130 135 

His Pro Asp Lys Asn His Ala Pro Gly Ala Thr Asp Ala Phe Lys 

140 145 150 

Lys lie Gly Asn Ala Tyr Ala Val Leu Ser Asn Pro Glu Lys Arg 

155 160 165 

Lys Gin Tyr Asp Leu Thr Gly Asn Glu Glu Gin Ala Cys Asn His 

170 175 180 

Gin Asn Asn Gly Arg Phe Asn Phe His Arg Gly Cys Glu Ala Asp 

185 190 195 

He Thr Pro Glu Asp Leu Phe Asn He Phe Phe Gly Gly Gly Phe 

200 205 210 

Pro Ser Gly Ser Val His Ser Phe Ser Asn Gly Arg Ala Gly Tyr 

215 220 225 

Ser Gin Gin His Gin His Arg His Ser Gly His Glu Arg Glu Glu 

230 235 240 

Glu Arg Gly Asp Gly Gly Phe Ser Val Phe He Gin Leu Met Pro 

245 250 255 

He He Val Leu He Leu Val Ser Leu Leu Ser Gin Leu Met Val 

260 265 270 

Ser Asn Pro Pro Tyr Ser Leu Tyr Pro Arg Ser Gly Thr Gly Gin 

275 280 285 

Thr He Lys Met Gin Thr Glu Asn Leu Gly Val Val Tyr Tyr Val 

290 295 300 

Asn Lys Asp Phe Lys Asn Glu Tyr Lys Gly Met Leu Leu Gin Lys 

305 310 315 

Val Glu Lys Ser Val Glu Glu Asp Tyr Val Thr Asn He Arg Asn 

320 325 330 

Asn Cys Trp Lys Glu Arg Gin Gin Lys Thr Asp Met Gin Tyr Ala 

335 340 345 

Ala Lys Val Tyr Arg Asp Asp Arg Leu Arg Arg Lys Ala Asp Ala

350 355 360

Leu Ser Met Asp Asn Cys Lys Glu Leu Glu Arg Leu Thr Ser Leu 

365 370 375 

Tyr Lys Gly Gly 



<210> 10 
<211> 737 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7758505CD1 

<400> 10 

Met Gly Val Leu Lys Val Trp Leu Gly Leu Ala Leu Ala Leu Ala
1 5 10 15

Glu Phe Ala Val Leu Pro His His Ser Glu Gly Ala Cys Val Tyr
20 25 30

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Gin Asp Ser Leu Leu Ala Asp Ala Thr lie Trp Lys Pro Asp Ser 
35 40 45 

Cys Gln Ser Cys Arg Cys His Gly Asp Ile Val Ile Cys Lys Pro
50 55 60

Ala Val Cys Arg Asn Pro Gln Cys Ala Phe Glu Lys Gly Glu Val
65 70 75

Leu Gin lie Ala Ala Asn Gin Cys Cys Pro Glu Cys Val Leu Arg 
80 85 90 

Thr Pro Gly Ser Cys His His Glu Lys Lys He His Glu His Gly 
95 100 105 

Thr Glu Trp Ala Ser Ser Pro Cys Ser Val Cys Ser Cys Asn His
110 115 120

Gly Glu Val Arg Cys Thr Pro Gln Pro Cys Pro Pro Leu Ser Cys
125 130 135

Gly His Gln Glu Leu Ala Phe Ile Pro Glu Gly Ser Cys Cys Pro
140 145 150

Val Cys Val Gly Leu Gly Lys Pro Cys Ser Tyr Glu Gly His Val
155 160 165

Phe Gin Asp Gly Glu Asp Trp Arg Leu Ser Arg Cys Ala Lys Cys 
170 175 180 

Leu Cys Arg Asn Gly Val Ala Gln Cys Phe Thr Ala Gln Cys Gln
185 190 195

Pro Leu Phe Cys Asn Gin Asp Glu Thr Val Val Arg Val Pro Gly 
200 205 210 

Lys Cys Cys Pro Gin Cys Ser Ala Arg Ser Cys Ser Ala Ala Gly 
215 220 225 

Gln Val Tyr Glu His Gly Glu Gln Trp Ser Glu Asn Ala Cys Thr
230 235 240

Thr Cys lie Cys Asp Arg Gly Glu Val Arg Cys His Lys Gin Ala 
245 250 255 

Cys Leu Pro Leu Arg Cys Gly Lys Gly Gin Ser Arg Ala Arg Arg 
260 265 270 

His Gly Gin Cys Cys Glu Glu Cys Val Ser Pro Ala Gly Ser Cys 
275 280 285 

Ser Tyr Asp Gly Val Val Arg Tyr Gin Asp Glu Met Trp Lys Gly 
290 295 300 

Ser Ala Cys Glu Phe Cys Met Cys Asp His Gly Gin Val Thr Cys 
305 310 315 

Gin Thr Gly Glu Cys Ala Lys Val Glu Cys Ala Arg Asp Glu Glu 
320 325 330 

Leu He His Leu Asp Gly Lys Cys Cys Pro Glu Cys He Ser Arg 
335 340 345 

Asn Gly Tyr Cys Val Tyr Glu Glu Thr Gly Glu Phe Met Ser Ser 
350 355 360 

Asn Ala Ser Glu Val Lys Arg Ile Pro Glu Gly Glu Lys Trp Glu
365 370 375

Asp Gly Pro Cys Lys Val Cys Glu Cys Arg Gly Ala Gin Val Thr 
380 385 390 

Cys Tyr Glu Pro Ser Cys Pro Pro Cys Pro Val Gly Thr Leu Ala 
395 400 405 

Leu Glu Val Lys Gly Gin Cys Cys Pro Asp Cys Thr Ser Val His 
410 415 420 

Cys His Pro Asp Cys Leu Thr Cys Ser Gin Ser Pro Asp His Cys 
425 430 435 

Asp Leu Cys Gln Asp Pro Thr Lys Leu Leu Gln Asn Gly Trp Cys
440 445 450
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Val His Ser Cys Gly Leu Gly Phe Tyr Gin Ala Gly Ser Leu Cys 

455 460 465 

lie Ala Cys Gin Pro Gin Cys Ser Thr Cys Thr Ser Gly Leu Glu 

470 475 480 

Cys Ser Ser Cys Gin Pro Pro Leu Leu Met Arg His Gly Gin Cys 

485 490 495 

Val Pro Thr Cys Gly Asp Gly Phe Tyr Gin Asp Arg His Ser Cys 

500 505 510 

Ala Val Cys His Glu Ser Cys Ala Gly Cys Trp Gly Pro Thr Glu 

515 520 525 

Lys His Cys Leu Ala Cys Arg Asp Pro Leu His Val Leu Arg Asp 

530 535 540 

Gly Gly Cys Glu Ser Ser Cys Gly Lys Gly Phe Tyr Asn Arg Gin 

545 550 555 

Gly Thr Cys Ser Ala Cys Asp Gin Ser Cys Asp Ser Cys Gly Pro 

560 565 570 

Ser Ser Pro Arg Cys Leu Thr Cys Thr Glu Lys Thr Val Leu His 

575 580 585 

Asp Gly Lys Cys Met Ser Glu Cys Pro Gly Gly Tyr Tyr Ala Asp 

590 595 600 

Ala Thr Gly Arg Cys Lys Val Cys His Asn Ser Cys Ala Ser Cys 

605 610 615 

Ser Gly Pro Thr Pro Ser His Cys Thr Ala Cys Ser Pro Pro Lys 

620 625 630 

Ala Leu Arg Gin Gly His Cys Leu Pro Arg Cys Gly Glu Gly Phe 

635 640 645 

Tyr Ser Asp His Gly Val Cys Lys Ala Cys His Ser Ser Cys Leu 

650 655 660 

Ala Cys Met Gly Pro Ala Pro Ser His Cys Thr Gly Cys Lys Lys 

665 670 675 

Pro Glu Glu Gly Leu Gin Val Glu Gin Leu Ser Gly Val Gly lie 

680 685 690 

Pro Ser Gly Glu Cys Leu Ala Gin Cys Arg Ala His Phe Tyr Leu 

695 700 705 

Glu Ser Thr Gly Leu Cys Glu Gly Gin Asn Leu Asp Phe Cys Gin 

710 715 720 

Asn Leu Glu Val lie Ser Ala Val Cys Leu Gly He Ser Ser Thr 

725 730 735 

Glu Asn 



<210> 11 
<211> 530 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 6885756CD1 

<400> 11 

Met Glu Asp Asp Ser Leu Tyr Leu Gly Gly Asp Trp Gln Phe Asn
1 5 10 15

His Phe Ser Lys Leu Thr Ser Ser Arg Leu Asp Ala Ala Phe Ala 
20 25 30 

Glu He Gin Arg Thr Ser Leu Ser Glu Lys Ser Pro Leu Ser Ser 
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35 40 45 

Glu Thr Arg Phe Asp Leu Cys Asp Asp Leu Ala Pro Val Ala Arg 
50 55 60 

Gin Leu Ala Pro Arg Glu Lys Leu Pro Leu Ser Ser Arg Arg Pro 
65 70 75 

Ala Ala Val Gly Ala Gly Leu Gin Lys He Gly Asn Thr Phe Tyr 
80 85 90 

Val Asn Val Ser Leu Gin Cys Leu Thr Tyr Thr Leu Pro Leu Ser 
95 100 105 

Asn Tyr Met Leu Ser Arg Glu Asp Ser Gin Thr Cys His Leu His 
110 115 120 

Lys Cys Cys Met Phe Cys Thr Met Gln Ala His Ile Thr Trp Ala
125 130 135

Leu Tyr Arg Pro Gly His Val He Gin Pro Ser Gin Val Leu Ala 
140 145 150 

Ala Gly Phe His Arg Gly Glu Gin Glu Asp Ala His Glu Phe Leu 
155 160 165 

Met Phe Thr Val Asp Ala Met Lys Lys Ala Cys Leu Pro Gly His 
170 175 180 

Lys Gin Leu Asp His His Ser Lys Asp Thr Thr Leu He His Gin 
185 190 195 

He Phe Gly Ala Tyr Trp Arg Ser Gin He Lys Tyr Leu His Cys 
200 205 210 

His Gly He Ser Asp Thr Phe Asp Pro Tyr Leu Asp He Ala Leu 
215 220 225 

Asp He Gin Ala Ala Gin Ser Val Lys Gin Ala Leu Glu Gin Leu 
230 235 240 

Val Lys Pro Lys Glu Leu Asn Gly Glu Asn Ala Tyr His Cys Gly
245 250 255

Leu Cys Leu Gin Lys Ala Pro Ala Ser Lys Thr Leu Thr Leu Pro 
260 265 270 

Thr Ser Ala Lys Val Leu He Leu Val Leu Lys Arg Phe Ser Asp 
275 280 285 

Val Thr Gly Asn Lys Leu Ala Lys Asn Val Gin Tyr Pro Lys Cys 
290 295 300 

Arg Asp Met Gln Pro Tyr Met Ser Gln Gln Asn Thr Gly Pro Leu
305 310 315

Val Tyr Val Leu Tyr Ala Val Leu Val His Ala Gly Trp Ser Cys 
320 325 330 

His Asn Gly His Tyr Phe Ser Tyr Val Lys Ala Gin Glu Gly Gin 
335 340 345 

Trp Tyr Lys Met Asp Asp Ala Glu Val Thr Ala Ser Gly Ile Thr
350 355 360

Ser Val Leu Ser Gin Gin Ala Tyr Val Leu Phe Tyr He Gin Lys 
365 370 375 

Ser Glu Trp Glu Arg His Ser Glu Ser Val Ser Arg Gly Arg Glu
380 385 390

Pro Arg Ala Leu Gly Ala Glu Asp Thr Asp Arg Pro Ala Thr Gin 
395 400 405 

Gly Glu Leu Lys Arg Asp His Pro Cys Leu Gin Val Pro Glu Leu 
410 415 420 

Asp Glu His Leu Val Glu Arg Ala Thr Gin Glu Ser Thr Leu Asp 
425 430 435 

His Trp Lys Phe Pro Gin Lys Gin Asn Lys Thr Lys Pro Glu Phe 
440 445 450 

Asn Val Arg Lys Val Glu Gly Thr Leu Pro Pro Asn Val Leu Val 
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455 460 465 

He His Gin Ser Lys Tyr Lys Cys Gly Met Lys Asn His His Pro 

470 475 480 

Glu Gln Gln Ser Ser Leu Leu Asn Leu Ser Ser Thr Lys Pro Thr

485 490 495

Asp Gin Glu Ser Met Asn Thr Gly Thr Leu Ala Ser Leu Gin Gly 

500 505 510 

Ser Thr Arg Arg Ser Lys Gly Asn Asn Lys His Ser Lys Arg Ser 

515 520 525 

Leu Leu Val Cys Gin 

530 

<210> 12 
<211> 511 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7500748CD1 

<400> 12 

Met Ala Ala Ala Met Pro Leu Ala Leu Leu Val Leu Leu Leu Leu
1 5 10 15

Gly Pro Gly Gly Trp Cys Leu Ala Glu Pro Pro Arg Asp Ser Leu
20 25 30

Arg Glu Glu Leu Val Ile Thr Pro Leu Pro Ser Gly Asp Val Ala
35 40 45

Ala Thr Phe Gln Phe Arg Thr Arg Trp Asp Ser Glu Leu Gln Arg
50 55 60

Glu Gly Val Ser His Tyr Arg Leu Phe Pro Lys Ala Leu Gly Gln
65 70 75

Leu Ile Ser Lys Tyr Ser Leu Arg Glu Leu His Leu Ser Phe Thr
80 85 90

Gln Gly Phe Trp Arg Thr Arg Tyr Trp Gly Pro Pro Phe Leu Gln
95 100 105

Ala Pro Ser Gly Ala Glu Leu Trp Val Trp Phe Gln Asp Thr Val
110 115 120

Thr Asp Val Asp Lys Ser Trp Lys Glu Leu Ser Asn Val Leu Ser
125 130 135

Gly Ile Phe Cys Ala Ser Leu Asn Phe Ile Asp Ser Thr Asn Thr
140 145 150

Val Thr Pro Thr Ala Ser Phe Lys Pro Leu Gly Leu Ala Asn Asp
155 160 165

Thr Asp His Tyr Phe Leu Arg Tyr Ala Val Leu Pro Arg Glu Val
170 175 180

Val Cys Thr Glu Asn Leu Thr Pro Trp Lys Lys Leu Leu Pro Cys
185 190 195

Ser Ser Lys Ala Gly Leu Ser Val Leu Leu Lys Ala Asp Arg Leu
200 205 210

Phe His Thr Ser Tyr His Ser Gln Ala Val His Ile Arg Pro Val
215 220 225

Cys Arg Asn Ala Arg Cys Thr Ser Ile Ser Trp Glu Leu Arg Gln
230 235 240

Thr Leu Ser Val Val Phe Asp Ala Phe Ile Thr Gly Gln Gly Lys
245 250 255

Lys Asp Trp Ser Leu Phe Arg Met Phe Ser Arg Thr Leu Thr Glu
260 265 270

Pro Cys Pro Leu Ala Ser Glu Ser Arg Val Tyr Val Asp Ile Thr
275 280 285

Thr Tyr Asn Gln Asp Asn Glu Thr Leu Glu Val His Pro Pro Pro
290 295 300

Thr Thr Thr Tyr Gln Asp Val Ile Leu Gly Thr Arg Lys Thr Tyr
305 310 315

Ala Ile Tyr Asp Leu Leu Asp Thr Ala Met Ile Asn Asn Ser Arg
320 325 330

Asn Leu Asn Ile Gln Leu Lys Trp Lys Arg Pro Pro Glu Asn Gly
335 340 345

Tyr Ile His Tyr Gln Pro Ala Gln Asp Arg Leu Gln Pro His Leu
350 355 360

Leu Glu Met Leu Ile Gln Leu Pro Ala Asn Ser Val Thr Lys Val
365 370 375

Ser Ile Gln Phe Glu Arg Ala Leu Leu Lys Trp Thr Glu Tyr Thr
380 385 390

Pro Asp Pro Asn His Gly Phe Tyr Val Ser Pro Ser Val Leu Ser
395 400 405

Ala Leu Val Pro Ser Met Val Ala Ala Lys Pro Val Asp Trp Glu
410 415 420

Glu Ser Pro Leu Phe Asn Ser Leu Phe Pro Val Ser Asp Gly Ser
425 430 435

Asn Tyr Phe Val Arg Leu Tyr Thr Glu Pro Leu Leu Val Asn Leu
440 445 450

Pro Thr Pro Asp Phe Ser Met Pro Tyr Asn Val Ile Cys Leu Thr
455 460 465

Cys Thr Val Val Ala Val Cys Tyr Gly Ser Phe Tyr Asn Leu Leu
470 475 480

Thr Arg Thr Phe His Ile Glu Glu Pro Arg Thr Gly Gly Leu Ala
485 490 495

Lys Arg Leu Ala Asn Leu Ile Arg Arg Ala Arg Gly Val Pro Pro
500 505 510

Leu



<210> 13 

<211> 476 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500749CD1 

<400> 13 

Met Ala Ala Ala Met Pro Leu Ala Leu Leu Val Leu Leu Leu Leu
1 5 10 15

Gly Pro Gly Gly Trp Cys Leu Ala Glu Pro Pro Arg Asp Ser Leu
20 25 30

Arg Glu Glu Leu Val Ile Thr Pro Leu Pro Ser Gly Asp Val Ala
35 40 45

Ala Thr Phe Gln Phe Arg Thr Arg Trp Asp Ser Glu Leu Gln Arg
50 55 60

Glu Gly Asp Thr Asp His Tyr Phe Leu Arg Tyr Ala Val Leu Pro
65 70 75

Arg Glu Val Val Cys Thr Glu Asn Leu Thr Pro Trp Lys Lys Leu
80 85 90

Leu Pro Cys Ser Ser Lys Ala Gly Leu Ser Val Leu Leu Lys Ala
95 100 105

Asp Arg Leu Phe His Thr Ser Tyr His Ser Gln Ala Val His Ile
110 115 120

Arg Pro Val Cys Arg Asn Ala Arg Cys Thr Ser Ile Ser Trp Glu
125 130 135

Leu Arg Gln Thr Leu Ser Val Val Phe Asp Ala Phe Ile Thr Gly
140 145 150

Gln Gly Lys Lys Asp Trp Ser Leu Phe Arg Met Phe Ser Arg Thr
155 160 165

Leu Thr Glu Pro Cys Pro Leu Ala Ser Glu Ser Arg Val Tyr Val
170 175 180

Asp Ile Thr Thr Tyr Asn Gln Asp Asn Glu Thr Leu Glu Val His
185 190 195

Pro Pro Pro Thr Thr Thr Tyr Gln Asp Val Ile Leu Gly Thr Arg
200 205 210

Lys Thr Tyr Ala Ile Tyr Asp Leu Leu Asp Thr Ala Met Ile Asn
215 220 225

Asn Ser Arg Asn Leu Asn Ile Gln Leu Lys Trp Lys Arg Pro Pro
230 235 240

Glu Asn Glu Ala Pro Pro Val Pro Phe Leu His Ala Gln Arg Tyr
245 250 255

Val Ser Gly Tyr Gly Leu Gln Lys Gly Glu Leu Ser Thr Leu Leu
260 265 270

Tyr Asn Thr His Pro Tyr Arg Ala Phe Pro Val Leu Leu Leu Asp
275 280 285

Thr Val Pro Trp Tyr Leu Arg Leu Tyr Val His Thr Leu Thr Ile
290 295 300

Thr Ser Lys Gly Lys Glu Asn Lys Pro Ser Tyr Ile His Tyr Gln
305 310 315

Pro Ala Gln Asp Arg Leu Gln Pro His Leu Leu Glu Met Leu Ile
320 325 330

Gln Leu Pro Ala Asn Ser Val Thr Lys Val Ser Ile Gln Phe Glu
335 340 345

Arg Ala Leu Leu Lys Trp Thr Glu Tyr Thr Pro Asp Pro Asn His
350 355 360

Gly Phe Tyr Val Ser Pro Ser Val Leu Ser Ala Leu Val Pro Ser
365 370 375

Met Val Ala Ala Lys Pro Val Asp Trp Glu Glu Ser Pro Leu Phe
380 385 390

Asn Ser Leu Phe Pro Val Ser Asp Gly Ser Asn Tyr Phe Val Arg
395 400 405

Leu Tyr Thr Glu Pro Leu Leu Val Asn Leu Pro Thr Pro Asp Phe
410 415 420

Ser Met Pro Tyr Asn Val Ile Cys Leu Thr Cys Thr Val Val Ala
425 430 435

Val Cys Tyr Gly Ser Phe Tyr Asn Leu Leu Thr Arg Thr Phe His
440 445 450

Ile Glu Glu Pro Arg Thr Gly Gly Leu Ala Lys Arg Leu Ala Asn
455 460 465

Leu Ile Arg Arg Ala Arg Gly Val Pro Pro Leu
470 475
<210> 14

<211> 344 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7503401CD1 

<400> 14 

Met Ala Ala Thr Glu Gly Val Gly Glu Ala Ala Gln Gly Gly Glu
1 5 10 15

Pro Gly Gln Pro Ala Gln Pro Pro Pro Gln Pro His Pro Pro Pro
20 25 30

Pro Gln Gln Gln His Lys Glu Glu Met Ala Ala Glu Ala Gly Glu
35 40 45

Ala Val Ala Ser Pro Met Asp Asp Gly Phe Val Ser Leu Asp Ser
50 55 60

Pro Ser Tyr Val Leu Tyr Arg Asp Arg Ala Glu Trp Ala Asp Ile
65 70 75

Asp Pro Val Pro Gln Asn Asp Gly Pro Asn Pro Val Val Gln Ile
80 85 90

Ile Tyr Ser Asp Lys Phe Arg Asp Val Tyr Asp Tyr Phe Arg Ala
95 100 105

Val Leu Gln Arg Asp Glu Arg Ser Glu Arg Ala Phe Lys Leu Thr
110 115 120

Arg Asp Ala Ile Glu Leu Asn Ala Ala Asn Tyr Thr Val Trp His
125 130 135

His Arg Arg Val Leu Val Glu Trp Leu Arg Asp Pro Ser Gln Glu
140 145 150

Leu Glu Phe Ile Ala Asp Ile Leu Asn Gln Asp Ala Lys Asn Tyr
155 160 165

His Ala Trp Gln His Arg Gln Trp Val Ile Gln Glu Phe Lys Leu
170 175 180

Trp Asp Asn Glu Leu Gln Tyr Val Asp Gln Leu Leu Lys Glu Asp
185 190 195

Val Arg Asn Asn Ser Val Trp Asn Gln Arg Tyr Phe Val Ile Ser
200 205 210

Asn Thr Thr Gly Tyr Asn Asp Arg Ala Val Leu Glu Arg Glu Val
215 220 225

Gln Tyr Thr Leu Glu Met Ile Lys Leu Val Pro His Asn Glu Ser
230 235 240

Ala Trp Asn Tyr Leu Lys Gly Ile Leu Gln Asp Arg Gly Leu Ser
245 250 255

Lys Tyr Pro Asn Leu Leu Asn Gln Leu Leu Asp Leu Gln Pro Ser
260 265 270

His Ser Ser Pro Tyr Leu Ile Ala Phe Leu Val Asp Ile Tyr Glu
275 280 285

Asp Met Leu Glu Asn Gln Cys Asp Asn Lys Glu Asp Ile Leu Asn
290 295 300

Lys Ala Leu Glu Leu Cys Glu Ile Leu Ala Lys Glu Lys Asp Thr
305 310 315

Ile Arg Lys Glu Tyr Trp Arg Tyr Ile Gly Arg Ser Leu Gln Ser
320 325 330

Lys His Ser Thr Glu Asn Asp Ser Pro Thr Asn Val Gln Gln
335 340
<210> 15

<211> 122 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7503485CD1 

<400> 15 

Met Ser Thr Pro Ala Arg Arg Arg Leu Met Arg Asp Phe Lys Arg
1 5 10 15

Leu Gln Glu Asp Pro Pro Ala Gly Val Ser Gly Ala Pro Ser Glu
20 25 30

Asn Asn Ile Met Val Trp Asn Ala Val Ile Phe Gly Pro Glu Gly
35 40 45

Thr Pro Phe Glu Asp Val Tyr Ala Asp Gly Ser Ile Cys Leu Asp
50 55 60

Ile Leu Gln Asn Arg Trp Ser Pro Thr Tyr Asp Val Ser Ser Ile
65 70 75

Leu Thr Ser Ile Gln Ser Leu Leu Asp Glu Pro Asn Pro Asn Ser
80 85 90

Pro Ala Asn Ser Gln Ala Ala Gln Leu Tyr Gln Glu Asn Lys Arg
95 100 105

Glu Tyr Glu Lys Arg Val Ser Ala Ile Val Glu Gln Ser Trp Arg
110 115 120

Asp Cys



<210> 16 
<211> 255 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7504076CD1 

<400> 16 

Met Ala Val Gly Asn Ile Asn Glu Leu Pro Glu Asn Ile Leu Leu
1 5 10 15

Glu Leu Phe Thr His Val Pro Ala Arg Gln Leu Leu Leu Asn Cys
20 25 30

Arg Leu Val Cys Ser Leu Trp Arg Asp Leu Ile Asp Leu Val Thr
35 40 45

Leu Trp Lys Arg Lys Cys Leu Arg Glu Gly Phe Ile Thr Glu Asp
50 55 60

Trp Asp Gln Pro Val Ala Asp Trp Lys Ile Phe Tyr Phe Leu Arg
65 70 75

Ser Leu His Arg Asn Leu Leu His Asn Pro Cys Ala Glu Glu Gly
80 85 90

Phe Glu Phe Trp Ser Leu Asp Val Asn Gly Gly Asp Glu Trp Lys
95 100 105

Val Glu Asp Leu Ser Arg Asp Gln Arg Lys Glu Phe Pro Asn Asp
110 115 120

Gln Val Lys Lys Tyr Phe Val Thr Ser Tyr Tyr Thr Cys Leu Lys
125 130 135

Ser Gln Val Val Asp Leu Lys Ala Glu Gly Tyr Trp Glu Glu Leu
140 145 150

Met Asp Thr Thr Arg Pro Asp Ile Glu Val Lys Asp Trp Phe Ala
155 160 165

Ala Arg Pro Asp Cys Gly Ser Lys Tyr Gln Leu Cys Val Gln Leu
170 175 180

Leu Ser Ser Ala His Ala Pro Leu Gly Thr Phe Gln Pro Asp Pro
185 190 195

Ala Thr Ile Gln Gln Lys Ser Asp Ala Lys Trp Arg Glu Val Ser
200 205 210

His Thr Phe Ser Asn Tyr Pro Pro Gly Val Arg Tyr Ile Trp Phe
215 220 225

Gln His Gly Gly Val Asp Thr His Tyr Trp Ala Gly Trp Tyr Gly
230 235 240

Pro Arg Val Thr Asn Ser Ser Ile Thr Ile Gly Pro Pro Leu Pro
245 250 255






<210> 17 
<211> 166 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7500926CD1 

<400> 17 

Met Ser Ala Trp Ala Ala Ala Ser Leu Ser Arg Ala Ala Ala Arg
1 5 10 15

Cys Leu Leu Ala Arg Gly Pro Gly Val Arg Ala Ala Pro Pro Arg
20 25 30

Asp Pro Arg Pro Ser His Pro Glu Pro Arg Gly Cys Gly Ala Ala
35 40 45

Pro Gly Arg Thr Leu His Phe Thr Ala Ala Val Pro Ala Gly His
50 55 60

Asn Lys Trp Ser Lys Val Arg His Ile Lys Gly Pro Lys Asp Val
65 70 75

Glu Arg Ser Arg Ile Phe Ser Lys Leu Cys Leu Asn Ile Arg Leu
80 85 90

Ala Val Lys Glu Gly Gly Pro Asn Pro Glu His Asn Ser Asn Leu
95 100 105

Ala Asn Ile Leu Glu Val Cys Arg Ser Lys His Met Pro Lys Ser
110 115 120

Thr Ile Glu Thr Ala Leu Lys Met Glu Lys Ser Lys Asp Thr Tyr
125 130 135

Leu Leu Tyr Glu Gly Arg Gly Pro Gly Gly Ser Ser Leu Leu Ile
140 145 150

Glu Ala Leu Ser Asn Ser Ser His Lys Cys Gln Ala Asp Leu Arg
155 160 165

Pro



<210> 18 
<211> 591 
<212> PRT

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7503216CD1 

<400> 18 

Met Pro Pro Lys Val Thr Ser Glu Leu Leu Arg Gln Leu Arg Gln
1 5 10 15

Ala Met Arg Asn Ser Glu Tyr Val Thr Glu Pro Ile Gln Ala Tyr
20 25 30

Ile Ile Pro Ser Gly Asp Ala His Gln Ser Glu Tyr Ile Ala Pro
35 40 45

Cys Asp Cys Arg Arg Ala Phe Val Ser Gly Phe Asp Gly Ser Ala
50 55 60

Gly Thr Ala Ile Ile Thr Glu Glu His Ala Ala Met Trp Thr Asp
65 70 75

Gly Arg Tyr Phe Leu Gln Ala Ala Lys Gln Met Asp Ser Asn Trp
80 85 90

Thr Leu Met Lys Met Gly Leu Lys Asp Thr Pro Thr Gln Glu Asp
95 100 105

Trp Leu Val Ser Val Leu Pro Glu Gly Ser Arg Val Gly Val Asp
110 115 120

Pro Leu Ile Ile Pro Thr Asp Tyr Trp Lys Lys Met Ala Lys Val
125 130 135

Leu Arg Ser Ala Gly His His Leu Ile Pro Val Lys Glu Asn Leu
140 145 150

Val Asp Lys Ile Trp Thr Asp Arg Pro Glu Arg Pro Cys Lys Pro
155 160 165

Leu Leu Thr Leu Gly Leu Asp Tyr Thr Gly Leu Phe Asn Leu Arg
170 175 180

Gly Ser Asp Val Glu His Asn Pro Val Phe Phe Ser Tyr Ala Ile
185 190 195

Ile Gly Leu Glu Thr Ile Met Leu Phe Ile Asp Gly Asp Arg Ile
200 205 210

Asp Ala Pro Ser Val Lys Glu His Leu Leu Leu Asp Leu Gly Leu
215 220 225

Glu Ala Glu Tyr Arg Ile Gln Val His Pro Tyr Lys Ser Ile Leu
230 235 240

Ser Glu Leu Lys Ala Leu Cys Ala Asp Leu Ser Pro Arg Glu Lys
245 250 255

Val Trp Val Ser Asp Lys Ala Ser Tyr Ala Val Ser Glu Thr Ile
260 265 270

Pro Lys Asp His Arg Cys Cys Met Pro Tyr Thr Pro Ile Cys Ile
275 280 285

Ala Lys Ala Val Lys Asn Ser Ala Glu Ser Glu Gly Met Arg Arg
290 295 300

Ala His Ile Lys Asp Ala Val Ala Leu Cys Glu Leu Phe Asn Trp
305 310 315

Leu Glu Lys Glu Val Pro Lys Gly Gly Val Thr Glu Ile Ser Ala
320 325 330

Ala Asp Lys Ala Glu Glu Phe Arg Arg Gln Gln Ala Asp Phe Val
335 340 345

Asp Leu Ser Phe Pro Thr Ile Ser Ser Thr Gly Pro Asn Gly Ala
350 355 360

Ile Ile His Tyr Ala Pro Val Pro Glu Thr Asn Arg Thr Leu Ser
365 370 375

Leu Asp Glu Val Tyr Leu Ile Asp Ser Gly Ala Gln Tyr Lys Asp
380 385 390

Gly Thr Thr Asp Val Thr Arg Thr Met His Phe Gly Thr Pro Thr
395 400 405

Ala Tyr Glu Lys Glu Cys Phe Thr Tyr Val Leu Lys Gly His Ile
410 415 420

Ala Val Ser Ala Ala Val Phe Pro Thr Gly Thr Lys Gly His Leu
425 430 435

Leu Asp Ser Phe Ala Arg Ser Ala Leu Trp Asp Ser Gly Leu Asp
440 445 450

Tyr Leu His Gly Thr Gly His Gly Val Gly Ser Phe Leu Asn Val
455 460 465

His Glu Gly Pro Cys Gly Ile Ser Tyr Lys Thr Phe Ser Asp Glu
470 475 480

Pro Leu Glu Ala Gly Met Ile Val Thr Asp Glu Pro Gly Tyr Tyr
485 490 495

Glu Asp Gly Ala Phe Gly Ile Arg Ile Glu Asn Val Val Leu Val
500 505 510

Val Pro Val Lys Thr Lys Tyr Asn Phe Asn Asn Arg Gly Ser Leu
515 520 525

Thr Phe Glu Pro Leu Thr Leu Val Pro Ile Gln Thr Lys Met Ile
530 535 540

Asp Val Asp Ser Leu Thr Asp Lys Glu Cys Asp Trp Leu Asn Asn
545 550 555

Tyr His Leu Thr Cys Arg Asp Val Ile Gly Lys Glu Leu Gln Lys
560 565 570

Gln Gly Arg Gln Glu Ala Leu Glu Trp Leu Ile Arg Glu Thr Gln
575 580 585

Pro Ile Ser Lys Gln His
590

<210> 19 

<211> 652 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7503233CD1 

<400> 19 



Met Ser Glu Glu Ile Ile Thr Pro Val Tyr Cys Thr Gly Val Ser
1 5 10 15

Ala Gln Val Gln Lys Gln Arg Ala Arg Glu Leu Gly Leu Gly Arg
20 25 30

His Glu Asn Ala Ile Lys Tyr Leu Gly Gln Asp Tyr Glu Gln Leu
35 40 45

Arg Val Arg Cys Leu Gln Ser Gly Thr Leu Phe Arg Asp Glu Ala
50 55 60

Phe Pro Pro Val Pro Gln Ser Leu Gly Tyr Lys Asp Leu Gly Pro
65 70 75

Asn Ser Ser Lys Thr Tyr Gly Tyr Ala Gly Ile Phe His Phe Gln
80 85 90

Leu Trp Gln Phe Gly Glu Trp Val Asp Val Val Val Asp Asp Leu
95 100 105

Leu Pro Ile Lys Asp Gly Lys Leu Val Phe Val His Ser Ala Glu
110 115 120

Gly Asn Glu Phe Trp Ser Ala Leu Leu Glu Lys Ala Tyr Ala Lys
125 130 135

Val Asn Gly Ser Tyr Glu Ala Leu Ser Gly Gly Ser Thr Ser Glu
140 145 150

Gly Phe Glu Asp Phe Thr Gly Gly Val Thr Glu Trp Tyr Glu Leu
155 160 165

Arg Lys Ala Pro Ser Asp Leu Tyr Gln Ile Ile Leu Lys Ala Leu
170 175 180

Glu Arg Gly Ser Leu Leu Gly Cys Ser Ile Asp Ile Ser Ser Val
185 190 195

Leu Asp Met Glu Ala Ile Thr Phe Lys Lys Leu Val Lys Gly His
200 205 210

Ala Tyr Ser Val Thr Gly Ala Lys Gln Val Asn Tyr Arg Gly Gln
215 220 225

Val Val Ser Leu Ile Arg Met Arg Asn Pro Trp Gly Glu Val Glu
230 235 240

Trp Thr Gly Ala Trp Ser Asp Ser Ser Ser Glu Trp Asn Asn Val
245 250 255

Asp Pro Tyr Glu Arg Asp Gln Leu Arg Val Lys Met Glu Asp Gly
260 265 270

Glu Phe Trp Met Ser Phe Arg Asp Phe Met Arg Glu Phe Thr Arg
275 280 285

Leu Glu Ile Cys Asn Leu Thr Pro Asp Ala Leu Lys Ser Arg Thr
290 295 300

Ile Arg Lys Trp Asn Thr Thr Leu Tyr Glu Gly Thr Trp Arg Arg
305 310 315

Gly Ser Thr Ala Gly Gly Cys Arg Asn Tyr Pro Ala Thr Phe Trp
320 325 330

Val Asn Pro Gln Phe Lys Ile Arg Leu Asp Glu Thr Asp Asp Pro
335 340 345

Asp Asp Tyr Gly Asp Arg Glu Ser Gly Cys Ser Phe Val Leu Ala
350 355 360

Leu Met Gln Lys His Arg Arg Arg Glu Arg Arg Phe Gly Arg Asp
365 370 375

Met Glu Thr Ile Gly Phe Ala Val Tyr Glu Val Pro Pro Glu Leu
380 385 390

Val Gly Gln Pro Ala Val His Leu Lys Arg Asp Phe Phe Leu Ala
395 400 405

Asn Ala Ser Arg Ala Arg Ser Glu Gln Phe Ile Asn Leu Arg Glu
410 415 420

Val Ser Thr Arg Phe Arg Leu Pro Pro Gly Glu Tyr Val Val Val
425 430 435

Pro Ser Thr Phe Glu Pro Asn Lys Glu Gly Asp Phe Val Leu Arg
440 445 450

Phe Phe Ser Glu Lys Ser Ala Gly Thr Val Glu Leu Asp Asp Gln
455 460 465

Ile Gln Ala Asn Leu Pro Asp Glu Gln Val Leu Ser Glu Glu Glu
470 475 480

Ile Asp Glu Asn Phe Lys Ala Leu Phe Arg Gln Leu Ala Gly Glu
485 490 495

Asp Met Glu Ile Ser Val Lys Glu Leu Arg Thr Ile Leu Asn Arg
500 505 510

Ile Ile Ser Lys His Lys Asp Leu Arg Thr Lys Gly Phe Ser Leu
515 520 525

Glu Ser Cys Arg Ser Met Val Asn Leu Met Asp Arg Asp Gly Asn
530 535 540

Gly Lys Leu Gly Leu Val Glu Phe Asn Ile Leu Trp Asn Arg Ile
545 550 555

Arg Asn Tyr Leu Ser Ile Phe Arg Lys Phe Asp Leu Asp Lys Ser
560 565 570

Gly Ser Met Ser Ala Tyr Glu Met Arg Met Ala Ile Glu Ser Ala
575 580 585

Gly Phe Lys Leu Asn Lys Lys Leu Tyr Glu Leu Ile Ile Thr Arg
590 595 600

Tyr Ser Glu Pro Asp Leu Ala Val Asp Phe Asp Asn Phe Val Cys
605 610 615

Cys Leu Val Arg Leu Glu Thr Met Phe Arg Phe Phe Lys Thr Leu
620 625 630

Asp Thr Asp Leu Asp Gly Val Val Thr Phe Asp Leu Phe Lys Trp
635 640 645

Leu Gln Leu Thr Met Phe Ala
650

<210> 20 

<211> 861 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7726576CD1 

<400> 20 

Met Ala Gly Pro Gly Pro Gly Ala Val Leu Glu Ser Pro Arg Gln
1 5 10 15

Leu Leu Gly Arg Val Arg Phe Leu Ala Glu Ala Ala Arg Ser Leu
20 25 30

Arg Ala Gly Arg Pro Leu Pro Ala Ala Leu Ala Phe Val Pro Arg
35 40 45

Glu Val Leu Tyr Lys Leu Tyr Lys Asp Pro Ala Gly Pro Ser Arg
50 55 60

Val Leu Leu Pro Val Trp Glu Ala Glu Gly Leu Gly Leu Arg Val
65 70 75

Gly Ala Ala Gly Pro Ala Pro Gly Thr Gly Ser Gly Pro Leu Arg
80 85 90

Ala Ala Arg Asp Ser Ile Glu Leu Arg Arg Gly Ala Cys Val Arg
95 100 105

Thr Thr Gly Glu Glu Leu Cys Asn Gly His Gly Leu Trp Val Lys
110 115 120

Leu Thr Lys Glu Gln Leu Ala Glu His Leu Gly Asp Cys Gly Leu
125 130 135

Gln Glu Gly Trp Leu Leu Val Cys Arg Pro Ala Glu Gly Gly Ala
140 145 150

Arg Leu Val Pro Ile Asp Thr Pro Asn His Leu Gln Arg Gln Gln
155 160 165

Gln Leu Phe Gly Val Asp Tyr Arg Pro Val Leu Arg Trp Glu Gln
170 175 180

Val Val Asp Leu Thr Tyr Ser His Arg Leu Gly Ser Arg Pro Gln
185 190 195

Pro Ala Glu Ala Tyr Ala Glu Ala Val Gln Arg Leu Leu Tyr Val
200 205 210

Pro Pro Thr Trp Thr Tyr Glu Cys Asp Glu Asp Leu Ile His Phe
215 220 225

Leu Tyr Asp His Leu Gly Lys Glu Asp Glu Asn Leu Gly Ser Val
230 235 240

Lys Gln Tyr Val Glu Ser Ile Asp Val Ser Ser Tyr Thr Glu Glu
245 250 255

Phe Asn Val Ser Cys Leu Thr Asp Ser Asn Ala Asp Thr Tyr Trp
260 265 270

Glu Ser Asp Gly Ser Gln Cys Gln His Trp Val Arg Leu Thr Met
275 280 285

Lys Lys Gly Thr Ile Val Lys Lys Leu Leu Leu Thr Val Asp Thr
290 295 300

Thr Asp Asp Asn Phe Met Pro Lys Arg Val Val Val Tyr Gly Gly
305 310 315

Glu Gly Asp Asn Leu Lys Lys Leu Ser Asp Val Ser Ile Asp Glu
320 325 330

Thr Leu Ile Gly Asp Val Cys Val Leu Glu Asp Met Thr Val His
335 340 345

Leu Pro Ile Ile Glu Ile Arg Ile Val Glu Cys Arg Asp Asp Gly
350 355 360

Ile Asp Val Arg Leu Arg Gly Val Lys Ile Lys Ser Ser Arg Gln
365 370 375

Arg Glu Leu Gly Leu Asn Ala Asp Leu Phe Gln Pro Thr Ser Leu
380 385 390

Val Arg Tyr Pro Arg Leu Glu Gly Thr Asp Pro Glu Val Leu Tyr
395 400 405

Arg Arg Ala Val Leu Leu Gln Arg Leu Ile Lys Ile Leu Asp Ser
410 415 420

Val Leu His His Leu Val Pro Ala Trp Asp His Thr Leu Gly Thr
425 430 435

Phe Ser Glu Ile Lys Gln Val Lys Gln Phe Leu Leu Leu Ser Arg
440 445 450

Gln Arg Pro Gly Leu Val Ala Gln Cys Leu Arg Asp Ser Glu Ser
455 460 465

Ser Lys Pro Ser Phe Met Pro Arg Leu Tyr Ile Asn Arg Arg Leu
470 475 480

Ala Met Glu His Arg Ala Cys Pro Ser Arg Asp Pro Ala Cys Lys
485 490 495

Asn Ala Val Phe Thr Gln Val Tyr Glu Gly Leu Lys Pro Ser Asp
500 505 510

Lys Tyr Glu Lys Pro Leu Asp Tyr Arg Trp Pro Met Arg Tyr Asp
515 520 525

Gln Trp Trp Glu Cys Lys Phe Ile Ala Glu Gly Ile Ile Asp Gln
530 535 540

Gly Gly Gly Phe Arg Asp Ser Leu Ala Asp Met Ser Glu Glu Leu
545 550 555

Cys Pro Ser Ser Ala Asp Thr Pro Val Pro Leu Pro Phe Phe Val
560 565 570

Arg Thr Ala Asn Gln Gly Asn Gly Thr Gly Glu Ala Arg Asp Met
575 580 585

Tyr Val Pro Asn Pro Ser Cys Arg Asp Phe Ala Lys Tyr Glu Trp
590 595 600

Ile Gly Gln Leu Met Gly Ala Ala Leu Arg Gly Lys Glu Phe Leu
605 610 615

Val Leu Ala Leu Pro Gly Phe Val Trp Lys Gln Leu Ser Gly Glu
620 625 630

Glu Val Ser Trp Ser Lys Asp Phe Pro Ala Val Asp Ser Val Leu
635 640 645

Val Lys Leu Leu Glu Val Met Glu Gly Met Asp Lys Glu Thr Phe
650 655 660

Glu Phe Lys Phe Gly Lys Glu Leu Thr Phe Thr Thr Val Leu Ser
665 670 675

Asp Gln Gln Val Val Glu Leu Ile Pro Gly Gly Ala Gly Ile Val
680 685 690

Val Gly Tyr Gly Asp Arg Ser Arg Phe Ile Gln Leu Val Gln Lys
695 700 705

Ala Arg Leu Glu Glu Ser Lys Glu Gln Val Ala Ala Met Gln Ala
710 715 720

Gly Leu Leu Lys Val Val Pro Gln Ala Val Leu Asp Leu Leu Thr
725 730 735

Trp Gln Glu Leu Glu Lys Lys Val Cys Gly Asp Pro Glu Val Thr
740 745 750

Val Asp Ala Leu Arg Lys Leu Thr Arg Phe Glu Asp Phe Glu Pro
755 760 765

Ser Asp Ser Arg Val Gln Tyr Phe Trp Glu Ala Leu Asn Asn Phe
770 775 780

Thr Asn Glu Asp Arg Ser Arg Val Leu Arg Phe Val Thr Gly Arg
785 790 795

Ser Arg Leu Pro Ala Arg Ile Tyr Ile Tyr Pro Asp Lys Leu Gly
800 805 810

Tyr Glu Thr Thr Asp Ala Leu Pro Glu Ser Ser Thr Cys Ser Ser
815 820 825

Thr Leu Phe Leu Pro His Tyr Ala Ser Ala Lys Val Cys Glu Glu
830 835 840

Lys Leu Arg Tyr Ala Ala Tyr Asn Cys Val Ala Ile Asp Thr Asp
845 850 855

Met Ser Pro Trp Glu Glu
860

<210> 21 

<211> 447 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7503507CD1 

<400> 21 

Met Ala Ser Val Ala Gln Glu Ser Ala Gly Ser Gln Arg Arg Leu
1 5 10 15

Pro Pro Arg His Gly Ala Leu Arg Gly Leu Leu Leu Leu Cys Leu
20 25 30

Trp Leu Pro Ser Gly Arg Ala Ala Leu Pro Pro Ala Ala Pro Leu
35 40 45

Ser Glu Leu His Ala Gln Leu Ser Gly Val Glu Gln Leu Leu Glu
50 55 60

Glu Phe Arg Arg Gln Leu Gln Gln Glu Arg Pro Gln Glu Glu Leu
65 70 75

Glu Leu Glu Leu Arg Ala Gly Gly Gly Pro Gln Glu Asp Cys Pro
80 85 90

Gly Arg Gly Ser Gly Gly Tyr Ser Ala Met Pro Asp Ala Ile Ile
95 100 105

Arg Thr Lys Asp Ser Leu Ala Ala Gly Ala Ser Phe Leu Arg Ala
110 115 120

Pro Ala Ala Val Arg Gly Trp Arg Gln Cys Val Ala Ala Cys Cys
125 130 135

Ser Glu Pro Arg Cys Ser Val Ala Val Val Glu Leu Pro Arg Arg
140 145 150

Pro Ala Pro Pro Ala Ala Val Leu Gly Cys Tyr Leu Phe Asn Cys
155 160 165

Thr Ala Arg Gly Arg Asn Val Cys Lys Phe Ala Leu His Ser Gly
170 175 180

Tyr Ser Ser Tyr Ser Leu Ser Arg Ala Pro Asp Gly Ala Ala Leu
185 190 195

Ala Thr Ala Arg Ala Ser Pro Arg Gln Glu Lys Asp Ala Pro Pro
200 205 210

Leu Ser Lys Ala Gly Gln Asp Val Val Leu His Leu Pro Thr Asp
215 220 225

Gly Val Val Leu Asp Gly Arg Glu Ser Thr Asp Asp His Ala Ile
230 235 240

Val Gln Tyr Glu Trp Ala Leu Leu Gln Gly Asp Pro Ser Val Asp
245 250 255

Met Lys Val Pro Gln Ser Gly Thr Leu Lys Leu Ser His Leu Gln
260 265 270

Glu Gly Thr Tyr Thr Phe Gln Leu Thr Val Thr Asp Thr Ala Gly
275 280 285

Gln Arg Ser Ser Asp Asn Val Ser Val Thr Val Leu Arg Ala Ala
290 295 300

Tyr Ser Thr Gly Gly Cys Leu His Thr Cys Ser Arg Tyr His Phe
305 310 315

Phe Cys Asp Asp Gly Cys Cys Ile Asp Ile Thr Leu Ala Cys Asp
320 325 330

Gly Val Gln Gln Cys Pro Asp Gly Ser Asp Glu Asp Phe Cys Gln
335 340 345

Asn Leu Gly Leu Asp Arg Lys Met Val Thr His Thr Ala Ala Ser
350 355 360

Pro Ala Leu Pro Arg Thr Thr Gly Pro Ser Glu Asp Ala Gly Gly
365 370 375

Asp Ser Leu Val Glu Lys Ser Gln Lys Ala Thr Ala Pro Asn Lys
380 385 390

Pro Pro Ala Leu Ser Asn Thr Glu Lys Arg Lys Val Ile Tyr Leu
395 400 405

Ser Gln Arg Val Met Glu Glu Glu Gly Asn Thr Gln Pro Gln Lys
410 415 420

Gln Val Gln Cys Tyr Pro Trp Arg Trp Val Trp Leu Ser Leu Leu
425 430 435

Cys Cys Phe Ser Trp Leu His Ala Asp Tyr Asp Trp
440 445

<210> 22 
<211> 468 
<212> PRT 

<213> Homo sapiens 
<220> 
<221> misc_feature

<223> Incyte ID No: 7503506CD1 

<400> 22 

Met Ala Ser Val Ala Gln Glu Ser Ala Gly Ser Gln Arg Arg Leu
1 5 10 15

Pro Pro Arg His Gly Ala Leu Arg Gly Leu Leu Leu Leu Cys Leu
20 25 30

Trp Leu Pro Ser Gly Arg Ala Ala Leu Pro Pro Ala Ala Pro Leu
35 40 45

Ser Glu Leu His Ala Gln Leu Ser Gly Val Glu Gln Leu Leu Glu
50 55 60

Glu Phe Arg Arg Gln Leu Gln Gln Glu Arg Pro Gln Glu Glu Leu
65 70 75

Glu Leu Glu Leu Arg Ala Gly Gly Gly Pro Gln Glu Asp Cys Pro
80 85 90

Gly Arg Gly Ser Gly Gly Tyr Ser Ala Met Pro Asp Ala Ile Ile
95 100 105

Arg Thr Lys Asp Ser Leu Ala Ala Gly Ala Ser Phe Leu Arg Ala
110 115 120

Pro Ala Ala Val Arg Gly Trp Arg Gln Cys Val Ala Ala Cys Cys
125 130 135

Ser Glu Pro Arg Cys Ser Val Ala Val Val Glu Leu Pro Arg Arg
140 145 150

Pro Ala Pro Pro Ala Ala Val Leu Gly Cys Tyr Leu Phe Asn Cys
155 160 165

Thr Ala Arg Gly Arg Asn Val Cys Lys Phe Ala Leu His Ser Gly
170 175 180

Tyr Ser Ser Tyr Ser Leu Ser Arg Ala Pro Asp Gly Ala Ala Leu
185 190 195

Ala Thr Ala Arg Ala Ser Pro Arg Gln Glu Lys Asp Ala Pro Pro
200 205 210

Leu Ser Lys Ala Gly Gln Asp Val Val Leu His Leu Pro Thr Asp
215 220 225

Gly Val Val Leu Asp Gly Arg Glu Ser Thr Asp Asp His Ala Ile
230 235 240

Val Gln Tyr Glu Trp Ala Leu Leu Gln Gly Asp Pro Ser Val Asp
245 250 255

Met Lys Val Pro Gln Ser Gly Thr Leu Lys Leu Ser His Leu Gln
260 265 270

Glu Gly Thr Tyr Thr Phe Gln Leu Thr Val Thr Asp Thr Ala Gly
275 280 285

Gln Arg Ser Ser Asp Asn Val Ser Val Thr Val Leu Arg Ala Ala
290 295 300

Tyr Ser Thr Gly Gly Cys Leu His Thr Cys Ser Arg Tyr His Phe
305 310 315

Phe Cys Asp Asp Gly Cys Cys Ile Asp Ile Thr Leu Ala Cys Asp
320 325 330

Gly Val Gln Gln Cys Pro Asp Gly Ser Asp Glu Asp Phe Cys Gln
335 340 345

Asn Leu Gly Leu Asp Arg Lys Met Val Thr His Thr Ala Ala Ser
350 355 360

Pro Ala Leu Pro Arg Thr Thr Gly Pro Ser Glu Asp Ala Gly Gly
365 370 375

Asp Ser Leu Val Glu Lys Ser Gln Lys Ala Thr Ala Pro Asn Lys
380 385 390

Pro Pro Ala Leu Ser Asn Thr Glu Lys Arg Asn His Ser Ala Phe
395 400 405

Trp Gly Pro Glu Ser Gln Ile Ile Pro Val Met Pro Gly Ala Val
410 415 420

Leu Pro Leu Ala Leu Gly Leu Ala Ile Thr Ala Leu Leu Leu Leu
425 430 435

Met Val Ala Cys Arg Leu Arg Leu Val Lys Gln Lys Leu Lys Lys
440 445 450

Ala Arg Pro Ile Thr Ser Glu Glu Ser Asp Tyr Leu Ile Asn Gly
455 460 465

Met Tyr Leu



<210> 23 
<211> 236 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7503509CD1 

<400> 23 

Met Ala Ser Val Ala Gln Glu Ser Ala Gly Ser Gln Arg Arg Leu
1 5 10 15

Pro Pro Arg His Gly Ala Leu Arg Gly Leu Leu Leu Leu Cys Leu
20 25 30

Trp Leu Pro Ser Gly Arg Ala Ala Leu Pro Pro Ala Ala Pro Leu
35 40 45

Ser Glu Leu His Ala Gln Leu Ser Gly Val Glu Gln Leu Leu Glu
50 55 60

Glu Phe Arg Arg Gln Leu Gln Gln Glu Arg Pro Gln Glu Glu Leu
65 70 75

Glu Leu Glu Leu Arg Ala Gly Gly Gly Pro Gln Glu Asp Cys Pro
80 85 90

Gly Pro Gly Ser Gly Gly Tyr Ser Ala Met Pro Asp Ala Ile Ile
95 100 105

Arg Thr Lys Asp Ser Leu Ala Ala Gly Ala Ser Phe Leu Arg Ala
110 115 120

Pro Ala Ala Val Arg Gly Trp Arg Gln Cys Val Ala Ala Cys Cys
125 130 135

Ser Glu Pro Arg Cys Ser Val Ala Val Val Glu Leu Pro Arg Arg
140 145 150

Pro Ala Pro Pro Ala Ala Val Leu Gly Cys Tyr Leu Phe Asn Cys
155 160 165

Thr Ala Arg Gly Arg Asn Val Cys Lys Phe Ala Leu His Ser Gly
170 175 180

Tyr Ser Ser Tyr Ser Leu Ser Arg Ala Pro Asp Gly Ala Ala Leu
185 190 195

Ala Thr Ala Arg Ala Ser Pro Arg Gln Gly Ala Ser Ile Arg Asn
200 205 210

Pro Glu Ala Val Pro Pro Thr Gly Gly Asn Leu His Leu Pro Ala
215 220 225

Asp Arg Asp Gly His Cys Arg Ala Glu Lys Leu
230 235
<210> 24
<211> 312 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7505800CD1 

<400> 24 

Met Ala Ala Thr Glu Gly Val Gly Glu Ala Ala Gln Gly Gly Glu
1 5 10 15

Pro Gly Gln Pro Ala Gln Pro Pro Pro Gln Pro His Pro Pro Pro
20 25 30

Pro Gln Gln Gln His Lys Glu Glu Met Ala Ala Glu Ala Gly Glu
35 40 45

Ala Val Ala Ser Pro Met Asp Asp Gly Phe Val Ser Leu Asp Ser
50 55 60

Pro Ser Tyr Val Leu Tyr Arg His Phe Arg Arg Val Leu Leu Lys
65 70 75

Ser Leu Gln Lys Asp Leu His Glu Glu Met Asn Tyr Ile Thr Ala
80 85 90

Ile Ile Glu Glu Gln Pro Lys Asn Tyr Gln Val Trp His His Arg
95 100 105

Arg Val Leu Val Glu Trp Leu Arg Asp Pro Ser Gln Glu Leu Glu
110 115 120

Phe Ile Ala Asp Ile Leu Asn Gln Asp Ala Lys Asn Tyr His Ala
125 130 135

Trp Gln His Arg Gln Trp Val Ile Gln Glu Phe Lys Leu Trp Asp
140 145 150

Asn Glu Leu Gln Tyr Val Asp Gln Leu Leu Lys Glu Asp Val Arg
155 160 165

Asn Asn Ser Val Trp Asn Gln Arg Tyr Phe Val Ile Ser Asn Thr
170 175 180

Thr Gly Tyr Asn Asp Arg Ala Val Leu Glu Arg Glu Val Gln Tyr
185 190 195

Thr Leu Glu Met Ile Lys Leu Val Pro His Asn Glu Ser Ala Trp
200 205 210

Asn Tyr Leu Lys Gly Ile Leu Gln Asp Arg Gly Leu Ser Lys Tyr
215 220 225

Pro Asn Leu Leu Asn Gln Leu Leu Asp Leu Gln Pro Ser His Ser
230 235 240

Ser Pro Tyr Leu Ile Ala Phe Leu Val Asp Ile Tyr Glu Asp Met
245 250 255

Leu Glu Asn Gln Cys Asp Asn Lys Glu Asp Ile Leu Asn Lys Ala
260 265 270

Leu Glu Leu Cys Glu Ile Leu Ala Lys Glu Lys Asp Thr Ile Arg
275 280 285

Lys Glu Tyr Trp Arg Tyr Ile Gly Arg Ser Leu Gln Ser Lys His
290 295 300

Ser Thr Glu Asn Asp Ser Pro Thr Asn Val Gln Gln
305 310

<210> 25 
<211> 452 
<212> PRT 
<213> Homo sapiens
<220> 

<221> misc_feature 

<223> Incyte ID No: 7503141CD1 



<400> 25 

Met Ala Ala Ala Thr Gly Pro Ser Phe Trp Leu Gly Asn Glu Thr
1 5 10 15

Leu Lys Val Pro Leu Ala Leu Phe Ala Leu Asn Arg Gln Arg Leu
20 25 30

Cys Glu Arg Leu Arg Lys Asn Pro Ala Val Gln Ala Gly Ser Ile
35 40 45

Val Val Leu Gln Gly Gly Glu Glu Thr Gln Arg Tyr Cys Thr Asp
50 55 60

Thr Gly Val Leu Phe Arg Gln Glu Ser Phe Phe His Trp Ala Phe
65 70 75

Gly Val Thr Glu Pro Gly Cys Tyr Gly Val Ile Asp Val Asp Thr
80 85 90

Gly Lys Ser Thr Leu Phe Val Pro Arg Leu Pro Ala Ser His Ala
95 100 105

Thr Trp Met Gly Lys Ile His Ser Lys Glu His Phe Lys Glu Lys
110 115 120

Tyr Ala Val Asp Asp Val Gln Tyr Val Asp Glu Ile Ala Ser Val
125 130 135

Leu Thr Ser Gln Lys Pro Ser Val Leu Leu Thr Leu Arg Gly Val
140 145 150

Asn Thr Asp Ser Gly Ser Val Cys Arg Glu Ala Ser Phe Asp Gly
155 160 165

Ile Ser Lys Phe Glu Val Asn Asn Thr Ile Leu His Pro Glu Ile
170 175 180

Val Glu Cys Leu Phe Glu His Tyr Cys Tyr Ser Arg Gly Gly Met
185 190 195

Arg His Ser Ser Tyr Thr Cys Ile Cys Gly Ser Gly Glu Asn Ser
200 205 210

Ala Val Leu His Tyr Gly His Ala Gly Ala Pro Asn Asp Arg Thr
215 220 225

Ile Gln Asn Gly Asp Met Cys Leu Phe Asp Met Gly Gly Glu Tyr
230 235 240

Tyr Cys Phe Ala Ser Asp Ile Thr Cys Ser Phe Pro Ala Asn Gly
245 250 255

Lys Phe Thr Ala Asp Gln Lys Ala Val Tyr Glu Ala Val Leu Arg
260 265 270

Ser Ser Arg Ala Val Met Gly Ala Met Lys Pro Gly Val Trp Trp
275 280 285

Pro Asp Met His Arg Leu Ala Asp Arg Ile His Leu Glu Glu Leu
290 295 300

Ala His Met Gly Ile Leu Ser Gly Ser Val Asp Ala Met Val Gln
305 310 315

Ala His Leu Gly Ala Val Phe Met Pro His Gly Leu Gly His Phe
320 325 330

Leu Gly Ile Asp Val His Asp Val Gly Gly Tyr Pro Glu Gly Val
335 340 345

Glu Arg Ile Asp Glu Pro Gly Leu Arg Ser Leu Arg Thr Ala Arg
350 355 360

His Leu Gln Pro Gly Met Val Leu Thr Val Glu Pro Gly Ile Tyr
365 370 375

Phe Ile Asp His Leu Leu Asp Glu Ala Leu Ala Asp Pro Ala Arg
380 385 390

Ala Ser Phe Leu Asn Arg Glu Val Leu Gln Arg Phe Arg Gly Phe
395 400 405

Gly Gly Val Arg Ile Glu Glu Asp Val Val Val Thr Asp Ser Gly
410 415 420

Ile Glu Leu Leu Thr Cys Val Pro Arg Thr Val Glu Glu Ile Glu
425 430 435

Ala Cys Met Ala Gly Cys Asp Lys Ala Phe Thr Pro Phe Ser Gly
440 445 450

Pro Lys



<210> 26 
<211> 471 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7500362CD1 

<400> 26 

Met Ala Ala Ala Thr Gly Pro Ser Phe Trp Leu Gly Asn Glu Thr
1 5 10 15

Leu Lys Val Pro Leu Ala Leu Phe Ala Leu Asn Arg Gln Arg Leu
20 25 30

Cys Glu Arg Leu Arg Lys Asn Pro Ala Val Gln Ala Gly Ser Ile
35 40 45

Val Ser Phe Phe His Trp Ala Phe Gly Val Thr Glu Pro Gly Cys
50 55 60

Tyr Gly Val Ile Asp Val Asp Thr Gly Lys Ser Thr Leu Phe Val
65 70 75

Pro Arg Leu Pro Ala Ser His Ala Thr Trp Met Gly Lys Ile His
80 85 90

Ser Lys Glu His Phe Lys Glu Lys Tyr Ala Val Asp Asp Val Gln
95 100 105

Tyr Val Asp Glu Ile Ala Ser Val Leu Thr Ser Gln Lys Pro Ser
110 115 120

Val Leu Leu Thr Leu Arg Gly Val Asn Thr Asp Ser Gly Ser Val
125 130 135

Cys Arg Glu Ala Ser Phe Asp Gly Ile Ser Lys Phe Glu Val Asn
140 145 150

Asn Thr Ile Leu His Pro Glu Ile Val Glu Cys Arg Val Phe Lys
155 160 165

Thr Asp Met Glu Leu Glu Val Leu Arg Tyr Thr Asn Lys Ile Ser
170 175 180

Ser Glu Ala His Arg Glu Val Met Lys Ala Val Lys Val Gly Met
185 190 195

Lys Glu Tyr Glu Leu Glu Ser Leu Phe Glu His Tyr Cys Tyr Ser
200 205 210

Arg Gly Gly Met Arg His Ser Ser Tyr Thr Cys Ile Cys Gly Ser
215 220 225

Gly Glu Asn Ser Ala Val Leu His Tyr Gly His Ala Gly Ala Pro
230 235 240

Asn Asp Arg Thr Ile Gln Asn Gly Asp Met Cys Leu Phe Asp Met
245 250 255

Gly Gly Glu Tyr Tyr Cys Phe Ala Ser Asp Ile Thr Cys Ser Phe
260 265 270

Pro Ala Asn Gly Lys Phe Thr Ala Asp Gln Lys Ala Val Tyr Glu
275 280 285

Ala Val Leu Arg Ser Ser Arg Ala Val Met Gly Ala Met Lys Pro
290 295 300

Gly Val Trp Trp Pro Asp Met His Arg Leu Ala Asp Arg Ile His
305 310 315

Leu Glu Glu Leu Ala His Met Gly Ile Leu Ser Gly Ser Val Asp
320 325 330

Ala Met Val Gln Ala His Leu Gly Ala Val Phe Met Pro His Gly
335 340 345

Leu Gly His Phe Leu Gly Ile Asp Val His Asp Val Gly Gly Tyr
350 355 360

Pro Glu Gly Val Glu Arg Ile Asp Glu Pro Gly Leu Arg Ser Leu
365 370 375

Arg Thr Ala Arg His Leu Gln Pro Gly Met Val Leu Thr Val Glu
380 385 390

Pro Gly Ile Tyr Phe Ile Asp His Leu Leu Asp Glu Ala Leu Ala
395 400 405

Asp Pro Ala Arg Ala Ser Phe Leu Asn Arg Glu Val Leu Gln Arg
410 415 420

Phe Arg Gly Phe Gly Gly Val Arg Ile Glu Glu Asp Val Val Val
425 430 435

Thr Asp Ser Gly Ile Glu Leu Leu Thr Cys Val Pro Arg Thr Val
440 445 450

Glu Glu Ile Glu Ala Cys Met Ala Gly Cys Asp Lys Ala Phe Thr
455 460 465

Pro Phe Ser Gly Pro Lys 

470 



<210> 27 
<211> 458 
<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7503328CD1 



<400> 27 



Met Ala Ala Ser Arg Lys Pro Pro Arg Val Arg Val Asn His Gln
1 5 10 15

Asp Phe Gln Leu Arg Asn Leu Arg Ile Ile Glu Pro Asn Glu Val
20 25 30

Thr His Ser Gly Asp Thr Gly Val Glu Thr Asp Gly Arg Met Pro
35 40 45

Pro Lys Val Thr Ser Glu Leu Leu Arg Gln Leu Arg Gln Ala Met
50 55 60

Arg Asn Ser Glu Tyr Val Thr Glu Pro Ile Gln Ala Tyr Ile Ile
65 70 75

Pro Ser Gly Asp Ala His Gln Ser Glu Tyr Ile Ala Pro Cys Asp
80 85 90

Cys Arg Arg Ala Phe Val Ser Gly Phe Asp Gly Ser Ala Gly Thr
95 100 105

Ala Ile Ile Thr Glu Glu His Ala Ala Met Trp Thr Asp Gly Arg
110 115 120

Tyr Phe Leu Gln Ala Ala Lys Gln Met Asp Ser Asn Trp Thr Leu
125 130 135

Met Lys Met Gly Leu Lys Asp Thr Pro Thr Gln Glu Asp Trp Leu
140 145 150

Val Ser Val Leu Pro Glu Gly Ser Arg Val Gly Val Asp Pro Leu
155 160 165

Ile Ile Pro Thr Asp Tyr Trp Lys Lys Met Ala Lys Val Leu Arg
170 175 180

Ser Ala Gly His His Leu Ile Pro Val Lys Glu Asn Leu Val Asp
185 190 195

Lys Ile Trp Thr Asp Arg Pro Glu Arg Pro Cys Lys Pro Leu Leu
200 205 210

Thr Leu Gly Leu Asp Tyr Thr Gly Ile Ser Trp Lys Asp Lys Val
215 220 225

Ala Asp Leu Arg Leu Lys Met Ala Glu Arg Asn Val Met Trp Phe
230 235 240

Val Val Thr Ala Leu Asp Glu Ile Ala Trp Leu Phe Asn Leu Arg
245 250 255

Gly Ser Asp Val Glu His Asn Pro Val Phe Phe Ser Tyr Ala Ile
260 265 270

Ile Gly Leu Glu Thr Ile Met Leu Phe Ile Asp Gly Asp Arg Ile
275 280 285

Asp Ala Pro Ser Val Lys Glu His Leu Leu Leu Asp Leu Gly Leu
290 295 300

Glu Ala Glu Tyr Arg Ile Gln Val His Pro Tyr Lys Ser Ile Leu
305 310 315

Ser Glu Leu Lys Ala Leu Cys Ala Asp Leu Ser Pro Arg Glu Lys
320 325 330

Val Trp Val Ser Asp Lys Ala Ser Tyr Ala Val Ser Glu Thr Ile
335 340 345

Pro Lys Asp His Arg Cys Cys Met Pro Tyr Thr Pro Ile Cys Ile
350 355 360

Ala Lys Ala Val Lys Asn Ser Ala Glu Ser Glu Gly Met Arg Arg
365 370 375

Ala His Ile Lys Asp Ala Val Ala Leu Cys Glu Leu Phe Asn Trp
380 385 390

Leu Glu Lys Glu Val Pro Lys Gly Gly Val Thr Glu Ile Ser Ala
395 400 405

Ala Asp Lys Ala Glu Glu Phe Arg Arg Gln Gln Ala Asp Phe Val
410 415 420

Asp Leu Ser Phe Pro Thr Ile Ser Ser Gln Ser Leu Arg Arg Ile
425 430 435

Gly Pro Cys Pro Trp Met Arg Cys Thr Leu Leu Thr Arg Val Leu
440 445 450

Asn Thr Arg Met Ala Pro Gln Met
455

<210> 28 

<211> 695 

<212> PRT 

<213> Homo sapiens 

<220> 
<221> misc_feature

<223> Incyte ID No: 7510464CD1 



<400> 28 

Met Ala Ala Ser Arg Lys Pro Pro Arg Val Arg Val Asn His Gln
1 5 10 15

Asp Phe Gln Leu Arg Asn Leu Arg Ile Ile Glu Pro Asn Glu Val
20 25 30

Thr His Ser Gly Asp Thr Gly Val Glu Thr Asp Gly Arg Met Pro
35 40 45

Pro Lys Val Thr Ser Glu Leu Leu Arg Gln Leu Arg Gln Ala Met
50 55 60

Arg Asn Ser Glu Tyr Val Thr Glu Pro Ile Gln Ala Tyr Ile Ile
65 70 75

Pro Ser Gly Asp Ala His Gln Ser Glu Tyr Ile Ala Pro Cys Asp
80 85 90

Cys Arg Arg Ala Phe Val Ser Gly Phe Asp Gly Ser Ala Gly Thr
95 100 105

Ala Ile Ile Thr Glu Glu His Ala Ala Met Trp Thr Asp Gly Arg
110 115 120

Tyr Phe Leu Gln Ala Ala Lys Gln Met Asp Ser Asn Trp Thr Leu
125 130 135

Met Lys Met Gly Leu Lys Asp Thr Pro Thr Gln Glu Asp Trp Leu
140 145 150

Val Ser Val Leu Pro Glu Gly Ser Arg Val Gly Val Asp Pro Leu
155 160 165

Ile Ile Pro Thr Asp Tyr Trp Lys Lys Met Ala Lys Val Leu Arg
170 175 180

Ser Ala Gly His His Leu Ile Pro Val Lys Glu Asn Leu Val Asp
185 190 195

Lys Ile Trp Thr Asp Arg Pro Glu Arg Pro Cys Lys Pro Leu Leu
200 205 210

Thr Leu Gly Leu Asp Tyr Thr Gly Ile Ser Trp Lys Asp Lys Val
215 220 225

Ala Asp Leu Arg Leu Lys Met Ala Glu Arg Asn Val Met Trp Phe
230 235 240

Val Val Thr Ala Leu Asp Glu Ile Ala Trp Leu Phe Asn Leu Arg
245 250 255

Gly Ser Asp Val Glu His Asn Pro Val Phe Phe Ser Tyr Ala Ile
260 265 270

Ile Gly Leu Glu Thr Ile Met Leu Phe Ile Asp Gly Asp Arg Ile
275 280 285

Asp Ala Pro Ser Val Lys Glu His Leu Leu Leu Asp Leu Gly Leu
290 295 300

Glu Ala Glu Tyr Arg Ile Gln Val His Pro Tyr Lys Ser Ile Leu
305 310 315

Ser Glu Leu Lys Ala Leu Cys Ala Asp Leu Ser Pro Arg Glu Lys
320 325 330

Val Trp Val Ser Asp Lys Ala Ser Tyr Ala Val Ser Glu Thr Ile
335 340 345

Pro Lys Asp His Arg Cys Cys Met Pro Tyr Thr Pro Ile Cys Ile
350 355 360

Ala Lys Ala Val Lys Asn Ser Ala Glu Ser Glu Gly Met Arg Arg
365 370 375

Ala His Ile Lys Asp Ala Val Ala Leu Cys Glu Leu Phe Asn Trp
380 385 390

Leu Glu Lys Glu Val Pro Lys Gly Gly Val Thr Glu Ile Ser Ala
395 400 405



Ala Asp Lys Ala Glu Glu Phe Arg Arg Gln Gln Ala Asp Phe Val
410 415 420

Asp Leu Ser Phe Pro Thr Ile Ser Ser Thr Gly Pro Asn Gly Ala
425 430 435

Ile Ile His Tyr Ala Pro Val Pro Glu Thr Asn Arg Thr Leu Ser
440 445 450

Leu Asp Glu Val Tyr Leu Ile Asp Ser Gly Ala Gln Tyr Lys Asp
455 460 465

Gly Thr Thr Asp Val Thr Arg Thr Met His Phe Gly Thr Pro Thr
470 475 480

Ala Tyr Glu Lys Glu Cys Phe Thr Tyr Val Leu Lys Gly His Ile
485 490 495

Ala Val Ser Ala Ala Val Phe Pro Thr Gly Thr Lys Gly His Leu
500 505 510

Leu Asp Ser Phe Ala Arg Ser Ala Leu Trp Asp Ser Gly Leu Asp
515 520 525

Tyr Leu His Gly Thr Gly His Gly Val Gly Ser Phe Leu Asn Val
530 535 540

His Glu Gly Pro Cys Gly Ile Ser Tyr Lys Thr Phe Ser Asp Glu
545 550 555

Pro Leu Glu Ala Gly Met Ile Val Thr Asp Glu Pro Gly Tyr Tyr
560 565 570

Glu Asp Gly Ala Phe Gly Ile Arg Ile Glu Asn Val Val Leu Val
575 580 585

Val Pro Val Lys Thr Lys Tyr Asn Phe Asn Asn Arg Gly Ser Leu
590 595 600

Thr Phe Glu Pro Leu Thr Leu Val Pro Ile Gln Thr Lys Met Ile
605 610 615

Asp Val Asp Ser Leu Thr Asp Lys Glu Glu Leu Trp Asn Gly Ile
620 625 630

Leu Pro Ala Arg Ser Leu Phe Cys Leu Phe Gln Phe Thr Val Arg
635 640 645

Leu Ala Gln Gln Leu Pro Pro Asp Leu Gln Gly Cys Asp Trp Glu
650 655 660

Gly Ile Ala Glu Thr Gly Pro Pro Gly Ser Ser Arg Val Ala His
665 670 675

Gln Arg Asp Ala Thr His Leu Gln Thr Ala Leu Ile Asn Thr Ser
680 685 690

Pro Val Leu Phe Leu
695

<210> 29
<211> 140
<212> PRT
<213> Homo sapiens

<220>
<221> misc_feature
<223> Incyte ID No: 7510394CD1

<400> 29

Met Ala Ala Ala Met Pro Leu Ala Leu Leu Val Leu Leu Leu Leu
1 5 10 15

Gly Pro Gly Gly Trp Cys Leu Ala Glu Pro Pro Arg Asp Ser Leu
20 25 30

Arg Glu Glu Leu Val Ile Thr Pro Leu Pro Ser Gly Asp Val Ala
35 40 45

Ala Thr Phe Gln Phe Arg Thr Arg Trp Asp Ser Glu Leu Gln Arg
50 55 60

Glu Gly Val Ser His Tyr Arg Leu Phe Pro Lys Ala Leu Gly Gln
65 70 75

Leu Ile Ser Lys Tyr Ser Leu Arg Glu Leu His Leu Ser Phe Thr
80 85 90

Gln Gly Phe Trp Arg Thr Arg Tyr Trp Gly Pro Pro Phe Leu Gln
95 100 105

Ala Pro Ser Gly Ala Glu Leu Trp Val Trp Phe Gln Asp Thr Val
110 115 120

Thr Glu Phe Ser Ser Gln Leu Trp Thr Leu Lys Glu Gly Ala Glu
125 130 135

Val Ala Pro Gly Gln
140 

<210> 30 
<211> 191 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7500745CD1 

<400> 30 

Met Ala Ala Ala Met Pro Leu Ala Leu Leu Val Leu Leu Leu Leu 
1 5 10 15

Gly Pro Gly Gly Trp Cys Leu Ala Glu Pro Pro Arg Asp Ser Leu
20 25 30

Arg Glu Glu Leu Val Ile Thr Pro Leu Pro Ser Gly Asp Val Ala
35 40 45

Ala Thr Phe Gln Phe Arg Thr Arg Trp Asp Ser Glu Leu Gln Arg
50 55 60

Glu Gly Val Ser His Tyr Arg Leu Phe Pro Lys Ala Leu Gly Gln
65 70 75

Leu Ile Ser Lys Tyr Ser Leu Arg Glu Leu His Leu Ser Phe Thr
80 85 90

Gln Gly Phe Trp Arg Thr Arg Tyr Trp Gly Pro Pro Phe Leu Gln
95 100 105

Ala Pro Ser Val Trp Ile Asn Leu Gly Arg Ser Ser Val Met Ser
110 115 120

Ser Gln Gly Ser Ser Ala Pro Leu Ser Thr Ser Ser Thr Pro Pro
125 130 135

Thr Gln Ser Leu Pro Leu Pro Pro Ser Asn Pro Trp Val Trp Pro
140 145 150

Met Thr Leu Thr Thr Thr Phe Cys Ala Met Leu Cys Cys Arg Gly
155 160 165

Arg Trp Ser Ala Pro Lys Thr Ser Pro Pro Gly Arg Ser Ser Cys
170 175 180

Pro Val Val Pro Arg Gln Ala Ser Leu Cys Cys
185 190 

<210> 31 
<211> 145
<212> PRT 
<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500929CD1 

<400> 31 

Met Ser Ala Trp Ala Ala Ala Ser Leu Ser Arg Ala Ala Ala Arg 
1 5 10 15

Cys Leu Leu Ala Arg Gly Pro Gly Val Arg Ala Ala Pro Pro Arg
20 25 30

Asp Pro Arg Pro Ser His Pro Glu Pro Arg Gly Cys Gly Ala Ala
35 40 45

Pro Gly Arg Thr Leu His Phe Thr Ala Ala Val Pro Ala Gly His
50 55 60

Asn Lys Trp Ser Lys Val Arg His Ile Lys Gly Pro Lys Asp Val
65 70 75

Glu Arg Ser Arg Ile Phe Ser Lys Leu Cys Leu Asn Ile Arg Leu
80 85 90

Ala Val Lys Ala Arg Arg Pro Lys Asp Arg Thr Cys Asp Leu Glu
95 100 105

Ala Lys Gly Ile Ser Leu Val Gly Pro Pro Cys Gln Leu Cys Cys
110 115 120

Cys Leu Arg Ala Ile Trp Met Ser Val Pro Thr Pro Ser Arg Met
125 130 135

Gln Gly Arg Thr Thr Gln Leu Val Arg Leu
140 145 

<210> 32 

<211> 2129 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 8268274CB1 

<400> 32 

ggcggcggcg gccggagccg gaaggcgggg aggggccggc cgttgggccc gaggcggcgg 60 
cggcggcggc ggcggctggg gagaagcgct ctcgtcgcct gcccgaggcc ggagcggcgg 120 
ggcccgcgcc tcctcccccc agcgccgcgg aggggggagg aggaagatgg agacccacat 180 
ctcatgcctg ttcccggagc tgctggccat gatcttcggc tacctggacg tccgggacaa 240 
ggggcgcgcg gcgcaggtgt gcaccgcctg gcgggacgcc gcctaccaca agtcggtgtg 300 
gcggggggtg gaggccaagc tgcacctgcg ccgggccaac ccgtcgctgt tccccagcct 360 
gcaggcccgg ggcatccgcc gggtgcagat cctgagcctc cgccgcagcc tcagctacgt 420 
gatccagggc atggccaaca tcgagagcct caacctcagc ggctgctaca acctcaccga 480 
caacgggctg ggccacgcgt ttgtgcagga gatcggctcc ctgcgcgctc tcaacctgag 540 
cctctgcaag cagatcactg acagcagcct gggccgcata gcccagtacc tcaagggcct 600 
ggaggtgctg gagctgggag gttgcagcaa catcaccaac actggccttc tgctcatcgc 660 
ctggggtctg cagcgcctca agagccttaa cctccgcagc tgccgccacc tttcggatgt 720 
gggcatcggg cacctggccg gcatgacgcg cagcgcggcg gagggctgcc tgggcctgga 780 
gcagctcacg ctacaggact gccagaagct cacagatctt tctctaaagc acatctcccg 840 
agggctgacg ggcctgaggc tcctcaacct cagcttctgt gggggaatct cggacgctgg 900 
cctcctgcac ctgtcgcaca tgggcagcct gcgcagcctc aacctgcgct cctgtgacaa 960 
catcagtgac acgggcatca tgcatctggc catgggcagc ctgcgcctct cggggctgga 1020
tgtctcgttc tgtgacaagg tgggagacca gagtctggct tacatagccc aggggctgga 1080 
tggcctcaag tctctctccc tctgctcctg ccacatcagt gatgatggca tcaaccgcat 1140 
ggtgcggcag atgcacgggc tgcgcacgct caacattgga cagtgtgtgc gcatcacgga 1200 
caagggcctg gagctgatcg ctgagcacct gagccaactc accggcatag acctgtacgg 1260 
ctgcacccga atcaccaagc gcggcctgga gcgcatcacg cagctgccgt gcctcaagga 1320 
ggcacgaggg gatttttctc cattattcac tgtgagaact cggggaagct ccagaaggtg 1380 
agggagaggg gacaacgaca tggttcccgt ggatctttaa cttccagact tgcccgctct 1440 
gcgcctctgg cactctggtg atgacagctc aggtttccct gcctgtcact gctcgggcag 1500 
aggctgctgc ccagggcttc tgctccggta ccttgtgaag ctgcattctc ctgccggttt 1560 
ctccagttct ggggacagtg gtttgctctg agacctcgct tcctttatgg atccaaggag 1620 
acttgctttt tcagtctgtt cagcttttta cttgctagga tggaattgca atttgcaagc 1680 
ttcttcgaca ggaaactaca agttccacac tttaatttta tacatataaa tatatacatg 1740 
tgtacatata tctatgtaca ggggtattat atatatacat ataagatgat gatatatata 1800 
atgatgatat gtattactga gaacgtaaaa tatcattaca tagtgatagc tggacacaca 1860 
aggaattcac aactccccaa agaaaataca tctggatgac ctgcctagca gtttccccat 1920 
gagatagagg aatgtctacg tatttcattc cctgttcctg ccctgaaaca atttcaatca 1980
ctgacaaatc attatcattc attaataatg tttactgagt gcccatatgt gaaagaaatc 2040 
cactctacat tccacagatg catttcctct ccccacgggg tttccatttt aatgggaaca 2100 
atgtagaata tatctgtctt cccttaaaa 2129 

<210> 33 
<211> 3489 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7500515CB1 

<400> 33 

ggcagacact ggagccacga tgaagccccc aaggcctgtc cgtacctgca gcaaagttct 60
cgtcctgctt tcactgctgg ccatccacca gactactact gccgaaaaga atggcatcga 120
catctacagc ctcaccgtgg actccagggt ctcatcccga tttgcccaca cggtcgtcac 180
cagccgagtg gtcaataggg ccaatactgt gcaggaggcc accttccaga tggagctgcc 240
caagaaagcc ttcatcacca acttctccat gatcatcgat ggcatgacct acccagggat 300
catcaaggag aaggctgaag cccaggcaca gtacagcgca gcagtggcca agggaaagag 360
cgctggcctc gtcaaggcca ccgggagaaa catggagcag ttccaggtgt cggtcagtgt 420
ggctcccaat gccaagatca cctttgagct ggtctatgag gagctgctca agcggcgttt 480
gggggtgtac gagctgctgc tgaaagtgcg gccccagcag ctggtcaagc acctgcagat 540
ggacattcac atcttcgagc cccagggcat cagctttctg gagacagaga gcaccttcat 600
gaccaaccag ctggtagacg ccctcaccac ctggcagaat aagaccaagg ctcacatccg 660
gttcaagcca acactttccc agcagcaaaa gtccccagag cagcaagaaa cagtcctgga 720
cggcaacctc attatccgct atgatgtgga ccgggccatc tccgggggct ccattcagat 780
cgagaacggc tactttgtac actactttgc ccccgagggc ctaaccacaa tgcccaagaa 840
tgtggtcttt gtcattgaca agagcggctc catgagtggc aggaaaatcc agcagacccg 900
ggaagcccta atcaagatcc tggatgacct cagccccaga gaccagttca acctcatcgt 960
cttcagtaca gaagcaactc agtggaggcc atcactggtg ccagcctcag ccgagaacgt 1020
gaacaaggcc aggagctttg ctgcgggcat ccaggccctg ggagggacca acatcaatga 1080
tgcaatgctg atggctgtgc agttgctgga cagcagcaac caggaggagc ggctgcccga 1140
agggagtgtc tcactcatca tcctgctcac cgatggcgac cccactgtgg gggagactaa 1200
ccccaggagc atccagaata acgtgcggga agctgtaagt ggccggtaca gcctcttctg 1260
cctgggcttc ggtttcgacg tcagctatgc cttcctggag aagctggcac tggacaatgg 1320
cggcctggcc cggcgcatcc atgaggactc agactctgcc ctgcagctcc aggacttcta 1380
ccaggaagtg gccaacccac tgctgacagc agtgaccttc gagtacccaa gcaatgccgt 1440
ggaggaggtc actcagaaca acttccggct cctcttcaag ggctcagaga tggtggtggc 1500
tgggaagctc caggaccggg ggcctgatgt gctcacagcc acagtcagtg ggaagctgcc 1560
tacacagaac atcactttcc aaacggagtc cagtgtggca gagcaggagg cggagttcca 1620 
gagccccaag tatatcttcc acaacttcat ggagaggctc tgggcatacc tgactatcca 1680 
gcagctgctg gagcaaactg tctccgcatc cgatgctgat cagcaggccc tccggaacca 1740 
agcgctgaat ttatcacttg cctacagctt tgtcacgcct ctcacatcta tggtagtcac 1800 
caaacccgat gaccaagagc agtctcaagt tgctgagaag cccatggaag gcgaaagtag 1860 
aaacaggaat gtccactcag ctggagctgc tggctcccgg atgaatttca gacctggggt 1920 
tctcagctcc aggcaacttg gactcccagg acctcctgat gttcctgacc atgctgctta 1980 
ccaccccttc cgccgtctgg ccatcttgcc tgcttcagca ccaccagcca cctcaaatcc 2040 
tgatccagct gtgtctcgtg tcatgaatat gaaaatcgaa gaaacaacca tgacaaccca 2100 
aaccccagcc cccatacagg ctccctctgc catcctgcca ctgcctgggc agagtgtgga 2160 
gcggctctgt gtggacccca gacaccgcca ggggccagtg aacctgctct cagaccctga 2220 
gcaaggggtt gaggtgactg gccagtatga gagggagaag gctgggttct catggatcga 2280 
agtgaccttc aagaaccccc tggtatgggt tcacgcatcc cctgaacacg tggtggtgac 2340 
tcggaaccga agaagctctg cgtacaagtg gaaggagacg ctattctcag tgatgcccgg 2400 
cctgaagatg accatggaca agacgggtct cctgctgctc agtgacccag acaaagtgac 2460 
catcggcctg ttgttctggg atggccgtgg ggaggggctc cggctccttc tgcgtgacac 2520 
tgaccgcttc tccagccacg ttggagggac ccttggccag ttttaccagg aggtgctctg 2580 
gggatctcca gcagcatcag atgacggcag acgcacgctg agggttcagg gcaatgacca 2640 
ctctgccacc agagagcgca ggctggatta ccaggagggg cccccgggag tggagatttc 2700 
ctgctggtct gtggagctgt agttctgatg gaaggagctg tgcccaccct gtacacttgg 2760 
cttccccctg caactgcagg gccgcttctg gggcctggac caccatgggg aggaagagtc 2820 
ccactcatta caaataaaga aaggtggtgt gagcctggga aaaaaaaaaa aaaaaaaaaa 2880 
aaaaaaaaaa aaaaaaaagg gggggccccc aaaataagga cccccaaccc cgggggatat 2940 
aaatactcgg ggacaagcgc ttaccactgg cggaggcgtg tttaatccca caccccacat 3000 
ggggggggca acgttatatt cccgtattgt cacgaggggc atccccacta aatgaggggc 3060 
ggcgtaatta aactatctcg gcaaaaggac ccagtggaat gaccccgtga tttatatgta 3120 
ctgacgcaga caacgacaca ctagctcaac aacacgacag ccacatcagt acctcgtcga 3180 
catgctgacg aagagtcgga ccccacatac acacaactaa aacaaccaaa ctctacacaa 3240 
caaactacac acatctaatc tccgactcag caccccaacc cacacccata acacacacac 3300 
acagaacaac caacaatatc atactatcat taactataaa acgacaaacc ctcataacac 3360 
ttatataatg cagtacatcc taatcacacc acaacaacaa aaaaacaacc atcatacatc 3420 
atccactaac actacattac aaaaccatca aaaaacgcca cacacccacc acactctcca 3480 
ctattctct 3489 



<210> 34 

<211> 2996 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 2256826CB1 

<220> 

<221> unsure 

<222> 360, 370 

<223> a, t, c, g, or other 



<400> 34 

ctgtttgcgc gcggacggag gagcggtgga ctcggggcag cggaggggcc cccgcgcacc 60
gtgcggtcct gctctcctct ccctcgctgg tccgcgagca cgcgcgccct tgcatccgcc 120
ccccagaccc ccttttcccc cccccccggc gcttccgtgt ctcgcctcct cccggtgggc 180
tcgcctgcct cggaggcgca gggggtcgtg gcgccgccgc gcaccggctg ctcccgggag 240
cgccgtcagg gcgggccccg tgtgggggag gggtggtttt ggaccttttc cgtaggggtc 300
ctccctttcg gcccctcccc ctaccgggcg ctccgaggcc ctggcggctc tgtccaatgn 360
agcagtggcn gttgctggca ggtggggagt gtttttgttt tgggttgaag ttgaggctga 420
ggagagagcc gagctagcga cgagcagtcg ttgcggccgc cggcgccgcg ggaggtggtg 480 
gaggcctagc cggagccgag aggtctcttg ttcccgtccc acggtcccgg cgtcacccct 540 
ccggcgccca gtccccgtcc cggaactccc gggcctgtcc tgggcccccg gtctgtgcac 600 
tccgctcgcc gcagcgcccg gcccgggccg cacccgccgg ccccatgagg agggacgtga 660 
acggagtgac caagagcagg tttgagatgt tctcaaatag tgatgaagct gtaatcaata 720 
aaaaacttcc caaagaactc ctgttacgga tattttcttt tctagatgtt gttaccctgt 780 
gccgctgtgc tcaggtctcc agggcctgga atgttctggc tctggatggc agtaactggc 840 
agcgaattga cctatttgat ttccagaggg atattgaggg ccgagtagtg gagaatattt 900 
caaaacgatg tgggggcttt ttacgaaagt taagtcttcg tggatgtctt ggagtgggag 960 
acaatgcatt aagaaccttt gcacaaaact gcaggaacat tgaagtactg aatctaaatg 1020 
ggtgtacaaa gacaacagac gctacatgta ctagccttag caagttctgt tccaaactca 1080 
ggcaccttga cttggcttcc tgtacatcaa taacaaacat gtctctaaaa gctctgagtg 1140 
agggatgtcc actgttggag cagttgaaca tttcctggtg tgaccaagta accaaggatg 1200 
gcattcaagc actagtgagg ggctgtgggg gtctcaaggc cttattctta aaaggctgca 1260 
cgcagctaga agatgaagct ctcaagtaca taggtgcaca ctgccctgaa ctggtgactt 1320 
tgaacttgca gacttgcttg caaatcacag atgaaggtct cattactata tgcagagggt 1380 
gccataagtt acaatccctt tgtgcctctg gctgctccaa catcacagat gccatcctga 1440 
atgctctagg tcagaactgc ccacggctta gaatattgga agtggcaaga tgttctcaat 1500 
taacagatgt gggctttacc actctagcca ggaattgcca tgaacttgaa aagatggacc 1560 
tggaagagtg tgttcagata acagatagca cattaatcca actttctata cactgtcctc 1620 
gacttcaagt attgagtctg tctcactgtg agctgatcac agatgatgga attcgtcacc 1680 
tggggaatgg ggcctgcgcc catgaccagc tggaggtgat tgagctggac aactgcccac 1740 
taatcacaga tgcatccctg gagcacttga agagctgtca tagccttgag cggatagaac 1800 
tctatgactg ccagcaaatc acacgggctg gaatcaagag actcaggacc catttaccca 1860 
atattaaagt ccacgcctac ttcgcacctg tcactccacc cccatcagta gggggcagca 1920
gacagcgctt ctgcagatgc tgcatcatcc tatgacaatg gaggtggtca accttggcga 1980 
actgagtatt taatgacact tctagagcta ccgtggagtc tctccagtgg aagcaacccc 2040 
agtgttctga gcaagggtta caaagtgagg gagggcagtg tccagatccc cagagccaca 2100 
catacataca catacacacc cttaccccca tccactctag ctttgtgacc atgggactga 2160 
agtttgtgat ggctttttta tcaagtagat tggtaaaatt taaccattcc tgttgaggtg 2220 
cccataagaa aatcataggc caagataggg aggggcattc cagcaaaccc cgtgttaatg 2280 
ctactgtggt ttttaaattt ttgtctaggg gtttctttgg ggattttaga acagcatctg 2340 
ctgtcctccg gggtcaagaa aagcatggaa agacaatata tgatgtaccc agggaccaga 2400 
aagaaaattt ctttgcatct tagaaatggt agacattcat tgtgactaaa gagcttctat 2460 
gcttccttgt ttccatgcca acatgctgag catgctcaca aagaaggctc gtccattcct 2520 
cctgtgtttt agtatttggc ccagaggttt cctaaatggt tgccttgaaa tcactgtggt 2580 
ccaaatgtaa ttcttacaca ctcaaattat cactgtctgt agcacacttg tgcacctgtc 2640 
ttacattctc tgttgctccc ccccacactc ttgctcagtc tgtcacctgt tcagtctgct 2700 
tactcactca attgttaccc ttttgctgtt gtcgtgttta cagtttgcat tttgaatgat 2760 
tagttgggat taccaaacat tttttaaaaa gatattatca ataaatattt ttttaattct 2820 
aaattttaaa aaaaaaaaaa aagggggggg ccgcttaaaa ggtcccaagt ttgattacgc 2880 
ttgcttccga cgtcatagcg ggcggcagaa ttccgatatc aagcttttgg atccggggac 2940 
ttcggggggg cccgttccca atgcgcctat gtgattgatt acgccccaca ggcgct 2996 

<210> 35 

<211> 1860 

<212> DNA 

<213> Homo sapiens

<220> 

<221> misc_feature

<223> Incyte ID No: 7686186CB1 

<400> 35 

gcctccggac tgaccttgcg gctgtatggc gcacgccccc tgtacccgga tgaagcgccc 60 
gatctttggg ctgtgttcgc gcgaactggc ggcgcgggcc gggctgcctg ccgtgcccgg 120

taccgcacta cgtgcccagc ggggtcgtca acgccttcgc caccggttcg aagcatcacg 180 

cggccatcgc gctgaccgac ggcctgctgc gtagcctcac gccccgcgag ttgaccggcg 240 

tgctcggcca cgaaatcgcg catattgcga acgaggattt gcgtgtcatg ggcctggccg 300 

attccatcag ccggctgacc catctgctgg ccctgctggg gcagcttgcg atcgtgctca 360 

gcttgccagc gctgctgctt ggagtcgcgg aagtcaattg gcccgcgttg cttctactgg 420 

cggtcgcgcc acagctggcc ttgctggctc agttgggctt gtccagggtg cgcgaattcg 480 

acgccgaccg gctcgctgcc gaattgaccg gcgacccgca cgggctggcc tcggcgctcg 540 

ccaagatcga gcgggtgagc cgctcctggc gcgcctggct gctgccccgg atgaggcaat 600 

ccggaaccct cctggttgcg cacgcatccg gcgacggctg aacgcattga gcgcttgctg 660 

gaacttgctc cgccgcccgc gatgccgccg tttccatcgg cccgtttcgt ccccgaggtg 720 

accgtatcac cacgtccgcc acgctggcgc accggcggcc tttgacgctg atttcaacat 780 

ggagaatgac gtgacctacc ccgaccccta cagccgcccg gcgccggacc gcttcatccg 840 

gcgctggctc gtcatcactg gctgcatcgc cgcactcatg ctgctgtggc agttcctgcc 900 

cgccatcgaa gcctggttca gtccccacga aacgcaggag cgcacggtga cgccgcgcgg 960 

cgacctggcc gccgacgaaa aaaccaccat cgagctgttc gagaaatcgc gcgggtcggt 1020 

ggtttacatc accacggcac aactagtgcg tgacgtctgg tcgcgcaatg tcttttccgt 1080 

gccgcgcggc accggctccg gcttcatctg ggacgatgcc ggccacgtgg tgaccaactt 1140 

ccacgtgatc cagggggcat cgtctgccac ggtcaaactg gccgacggtc gcgattatca 1200 

ggctgcgctc gttggcgcca gtcctgcgca cgacatcgcg gtactcaaga ttggcgtcgg 1260 

cttcaagcgc ccgccggcgg tgccggtggg caccagtgcc gatctcaagg tggggcaaaa 1320 

ggtctttgcc attggcaatc ccttcgggct cgactggacg ctcaccaccg gcatcgtctc 1380 

ggcgcttgac cgcacccttt ccggcgacgc cagtggcccg gccattgacc acctgatcca 1440 

gaccgacgcc gctatcaacc ccggcaattc cggtggcccg ctgctcgatt cggctgggcg 1500 

gctgatcggc atcaataccg ccatctacag tccgtctggc gcctcggccg gcatcggctt 1560 

tgcggtgccg gtcgataccg tcatgcgcgt ggtgccgcaa ctcataaaga ccggcaagta 1620 

catccgtccg gcgctgggca tcgaggtgga tgagcagctc aacgcgcgtc tgcaggcgct 1680 

gaccggcagt aagggcgtat tcgtattgcg cgtgacgccg ggctcggcgg cgcacagggc 1740 

cgggctcgtc ggcgtcgagg tcaccgcagg cggcatcgtg cccggcgatc gcgttatcag 1800 

catcgacggt atcgccgtcg acccgggaat tccggaccgt acctgctgac ctcttcagac 1860 

<210> 36 
<211> 1334 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 72617436CB1 

<400> 36 

cctgatgatc tgtcactgtc tcccatcact cccagatggg accgtctagt ctcaggaaaa 60 
caagctctgg gctcccactg attctacatt atggtgtgat cctaggagcg cccctggcct 120 
ccagctgcgc aggagcctgt ggtaccagct tcccagatgg cctcacccct gagggaaccc 180 
aggcctccgg ggacaaggac attcctgcaa ttaaccaagg gctcatcctg gaagaaaccc 240 
cagagagcag cttcctcatc gagggggaca tcatccggcc gagtcccttc cgactgctgt 300 
cagcaaccag caacaaatgg cccatgggtg gtagtggtgt cgtggaggtc cccttcctgc 360 
tctccagcaa gtacgatgag cccagccgcc aggtcatcct ggaggctctt gcggagtttg 420 
aacgttccac gtgcatcagg tttgtcacct atcaggacca gagagacttc atttccatca 480 
tccccatgta tgggtgcttc tcgagtgtgg ggcgcagtgg agggatgcag gtggtctccc 540 
tggcgcccac gtgtctccag aagggccggg gcattgtcct tcatgagctc atgcatgtgc 600 
tgggcttctg gcacgagcac acgcgggccg accgggaccg ctatatccgt gtcaactgga 660 
acgagatcct gccaggcttt gaaatcaact tcatcaagtc tcggagcagc aacatgctga 720 
cgccctatga ctactcctct gtgatgcact atgggaggct tgccttcagc cggcgtgggc 780 
tgcccaccat cacaccactt tgggccccca gtgtccacat cggccagcga tggaacctga 840 
gtgcctcgga catcacccgg gtcctccaac tctacggctg cagcccaagt ggccccaggc 900 
cccgtgggag agggtcccat gcccacagca ctggtaggag ccccgctccg gcctccctat 960
ctctgcagcg gcttttggag gcactgtcgg cggaatccag gagccccgac cccagtggtt 1020 
ccagtgcggg aggccagccc gttcctgcag ggcctgggga gagcccacat gggtgggagt 1080 
cccctgccct gaaaaagctc agtgcagagg cctcggcaag gcagcctcag accctagctt 1140 
cctccccaag atcaaggcct ggagcaggtg cccccggtgt tgctcaggag cagtcctggc 1200 
tggccggagt gtccaccaag cccacagtcc catcttcaga agcaggaatc cagccagtcc 1260 
ctgtccaggg aagcccagct ctgccagggg gctgtgtacc tagaaatcat ttcaagggga 1320 
tgtccgaaga ttaa 1334 

<210> 37 
<211> 2070 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7501945CB1 

<400> 37 

gctggaacag tgggtcctcg tcgtgactca gagccaggcc cagtgctgcc ccgatcagac 60 

cccgcttcct tttcggaatg tggaaggatg cgagtgtgtg cctgagggtg tggtcctgtg 120 

ggggtgggtg gtcttggccc gtgcacacct gcatgttgcg catggtgttc ctgtgggagc 180 

gtgtgtgcac gcccacacaa gaggcacttc ccaggccgtg tgggtctcta cttcctcaca 240 

gcaggcctgc atggcctccc tactcttctc ccacatccca gggctggggc tcacagcgtc 300 

ccctgttcca gagctccttc atgtccccca aaggcatgtg gccctgtgtc tggtgcagag 360 

gacctgtcac cagggatggg attcattgct ctgcacaccc acccaactag cacatcttgg 420 

ccgagcttca attacgggcc cagcatagga ctaaggttca gtggagacct gagaccagtg 480 

ccccgcccac gccagaagcc aaggggtacc tgccgacgct gccgggaggg ggtgctgtga 540 

tctgcctgtg agcaggggtg ctgaggcctt tggaaatggt tgtgcgggag ccagtcctgc 600 

ttcaggctca ctgggacact cggccatttg gagtgtcatc cagccaaggg ttgtctctgc 660 

acagtgctac ctctaaggat ctgtccgtgg tctgttagga actgggctgc acagccggag 720 

gtgagcagct tcatcagtat tacagccgca ccccattgct agcattactg cctgagctcc 780 

gcctcctatc agatcagcgg tggcattaga ttctcatagg agcttgaacc ctgttgtgaa 840 

ctgcattgga gggatatagg atgtatgctc cttatgaaac tctaactaat gcctgatgat 900 

ctgtcactgt ctcccatcac tcccagatgg gaccgtctag tctcaggaaa acaagctctg 960 

ggctcccact gattctacat tatggtgtga tcctaggagc gcccctggcc tccagctgcg 1020 

caggagcctg tggtaccagc ttcccagatg gcctcacccc tgagggaacc caggcctccg 1080 

gggacaagga cattcctgca attaaccaag ggctcatcct ggaagaaacc ccagagagca 1140 

gcttcctcat cgagggggac atcatccggc cgagtccctt ccgactgctg tcagcaacca 1200

gcaacaaatg gcccatgggt ggtagtggtg tcgtggaggt ccccttcctg ctctccagca 1260 

agtacgatga gcccagccgc caggtcatcc tggaggctct tgcggagttt gaacgttcca 1320 

cgtgcatcag gtttgtcacc tatcaggacc agagagactt catttccatc atccccatgt 1380 

atgggtgctt ctcgagtgtg gggcgcagtg gagggatgca ggtggtctcc ctggcgccca 1440

cgtgtctcca gaagggccgg ggcattgtcc ttcatgagct catgcatgtg ctgggcttct 1500

ggcacgagca cacgcgggcc gaccgggacc gctatatcca tgtcaactgg aacgagatcc 1560

tgccaggctt tgaaatcaac ttcatcaagt ctcggagcag caacatgctg acgccctatg 1620

actactcctc tgtgatgcac tatgggaggg tcccatgccc acagcactgg taggagcccc 1680

gctccggcct ccctatctct gcagcggctt ttggaggcac tgtcggcgga atccaggagc 1740

cccgacccca gtggttccag tgcgggaggc cagcccgttc ctgcagggcc tggggagagc 1800

ccacatgggt gggagtcccc tgccctgaaa aagctcagtg cagaggcctc ggcaaggcag 1860

cctcagaccc tggcttcctc cccaagatca aggcctggag caggtgcccc cggtgttgct 1920

caggagcagt cctggctggc cggagtgtcc accaagccca cagtcccatc ttcagaagca 1980 

ggaatccagc cagtccctgt ccagggaagc ccagctctgc cagggggctg tgtacctaga 2040 

aatcatttca aggggatgtc cgaagattaa 2070 



<210> 38 
<211> 2265

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500264CB1 

<400> 38 

gggttcgggg gcgccgcgct gtgaggccgg ggcctagagc cagccgcggc cgcgcaggag 60 
gggcccaggg cccgcgctcg cccgcgtccc cgccttcctc ccgcgctcag ccccgcctcg 120 
gctcgctgcc cttggctctc gtcgccatgg cctccgtcgc ccaggagagc gcgggctcgc 180 
agcgccggct accgccgcgt cacggggcgc tgcgcgggct gctactgctc tgcctgtggc 240 
tgccaagcgg ccgtgcggcc ttgccgcccg cggcgccgct gtccgaactg cacgcgcagc 300 
tgtcgggcgt ggagcagctg ctggaggagt tccgccggca actgcagcag gagcggcctc 360 
aggaggagct ggagctggag ctgcgcgcgg gcggcggccc ccaggaggac tgcccgggcc 420 
cgggcagcgg cggctacagc gcaatgcctg acgccatcat ccgcaccaag gactccctgg 480 
cggcgggtgc cagcttcctg cgggcgccgg cggccgtgcg gggctggcgg caatgcgtgg 540 
cggcctgctg ctccgagccg cgctgctccg tggccgtggt ggagctgccc cggcgccccg 600 
cgcccccggc agccgtgctc ggctgctacc tcttcaactg cacggcgcgc ggccgcaacg 660 
tctgcaagtt cgcgctgcac agcggctaca gcagctacag cctcagccgc gcgccggacg 720 
gcgccgccct ggccaccgcg cgcgcctcgc cccggcagga aaaggatgcg cctccactta 780 
gcaaggctgg gcaggatgtg gttctgcatc tgcccacaga cggggtggtt ctagacggcc 840 
gcgagagcac agatgaccac gccatcgtcc agtatgagtg ggcactgctg cagggggacc 900 
cgtcagtgga catgaaggtg cctcaatcag ggggtgactc cttggtggaa aagtctcaga 960 
aagccactgc cccaaacaag ccacctgcat tatcaaacac agagaagagg aatcattccg 1020 
ccttttgggg accagagagt caaatcattc ctgtgatgcc agatagtagt tcctcaggga 1080 
agaacagaaa agaggaaagt tatatatttg agtcaaaggg tgatggagga ggaggggaac 1140 
acccagcccc agaaacaggt gcagtgctac ccctggcgct gggtttggct atcactgctc 1200 
tgctgcttct catggttgca tgccgactac gactggtgaa acagaaactg aaaaaagctc 1260 
gtcccattac atctgaggaa tcggactacc tcataaatgg gatgtatcta tagtaatgta 1320 
atttcaatac cttggggcag ggacatgttt tgtttataat ttatacatct attaagttct 1380 
ggatatttac agcttctttt gtttttaatt gggccagaag attctgcaaa tcccaaatct 1440 
ttctttatta tttattgtaa aaaaagtttc cttagaagtc ataaaatatt ttgaaattta 1500 
gagaggaatt catgattaaa gattcctaaa aatataattc tgatttatgt aagctgtccc 1560 
tgaaaataga aatgtgtact tagctgagag aaaattcagc atctcaggag gtggtattag 1620 
gatgactgtg ttaacccatt accttttaga agccaactgt tggcccctta ccatgctgga 1680 
ctgctatagg cccagcttcc ccttgttctg tggccctttt cttcctcctt gaagctccca 1740 
gtattctttt tcttttcccc tctaaacctg tttctgagag tggatctcaa gcaagttcat 1800 
gccttcaatc agatgttact tagggtgggt atacctaaat tataaacctt atgtacaagt 1860 
cagtaagcct tagggaaggt gagtgtgggt ccttcctaat ccctctgacg tcatgtcata 1920 
taggtggctg cctccttaga ctgacctttg ggagaaaaaa accccagact ttgaattagt 1980 
aacagctcta agatggtcat gcagtgagat aggaaatcaa gatggaagca gagaatctgg 2040 
catgccaaaa actaacagaa acttagttga aggcaaagag agcaaggaga acgtttaata 2100 
cttcattaca tcaaatcaac actgctccat ggtgagagca cagcaactca tttatatata 2160 
tatatatagg ctttgttgat gaaaaacgac aattgaagag aggacgttga gtggattcct 2220 
gggtacagct tttgtaaaaa tgtcaccatg gctttcatcc aatgg 2265 

<210> 39 

<211> 1834 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7499935CB1 






<400> 39 

gtgacgccgg tgccgggcga acatggcggc ggccaccgga ccctcgtttt ggctggggaa 60 
tgaaaccctg aaggtgccgc tggcgctctt tgccttgaac cggcagcgcc tgtgtgagcg 120 
gctgcggaag aaccctgctg tgcaggccgg ctccatcgtg gtcctgcagg gcggggagga 180 
gactcagcgc tactgcaccg acaccggggt cctcttccgc caggagtcct tctttcactg 240 
ggcgttcggt gtcactgagc caggctgcta tggtgtcatc gatgttgaca ctgggaagtc 300 
gaccctgttt gtgcccaggc ttcctgccag ccatgccacc tggatgggaa agatccattc 360 
caaggagcac ttcaaggaga agtatgccgt ggacgacgtc cagtacgtag atgagattgc 420 
cagcgtcctg acgtcacaga agccctctgt cctcctcact ttgcgtggcg tcaacacgga 480 
cagcggcagt gtctgcaggg aggcctcctt tgacggcatc agcaagttcg aagtcaacaa 540 
taccattctt cacccagaga tcgttgagtg ccgagtgttt aagacggata tggagctgga 600 
ggttctgcgc tataccaata aaatctccag cgaggcccac cgtgaggtaa tgaaggctgt 660 
aaaagtggga atgaaagaat atgagttgga aagcctcttc gagcactact gctactcccg 720 
gggcggcatg cgccacagct cctacacctg catctgcggc agtggtgaga actcagccgt 780 
gctacactac ggacacgccg gagctcccaa cgaccgaacg atccagaatg gggatatgtg 840 
cctgttcgac atgggcggtg agtattactg cttcgcttcc gacatcacct gctcctttcc 900 
cgccaacggc aagttcactg cagaccagaa ggccgtctat gaggcagtgc tgcggagctc 960 
ccgtgccgtc atgggtgcca tgaagccagg tgtctggtgg cctgacatgc accgcctggc 1020 
tgaccgcatc cacctggagg agctggccca catgggcatc ctgagcggca gcgtggacgc 1080 
catggtccag gctcacctgg gggccgtgtt tatgcctcac gggcttggcc acttcctggg 1140 
cattgacgtg cacgacgtgg gaggctaccc agagggcgtg gagcgcatct acttcatcga 1200 
ccacctcctg gatgaggccc tggcggaccc ggcccgcgcc tccttcctta accgcgaggt 1260 
cctgcagcgc tttcgcggtt ttggcggggt ccgcatcgag gaggacgtcg tggtgactga 1320 
cagcggcata gagctgctga cctgcgtgcc ccgcactgtg gaagagattg aagcatgcat 1380 
ggcaggctgt gacaaggcct ttaccccctt ctctggcccc aagtagagcc agccagaaat 1440 
cccagcgcac ctgggggcct ggccttgcaa cctcttttcg tgatgggcag cctgctggtc 1500 
agcactccag tagcgagaga cggcacccag aatcagatcc cagcttcggc atttgatcag 1560 
accaaacagt gctgtttccc ggggaggaaa cactttttta attacccttt tgcaggctcc 1620 
cacctttaat ctgttttata ccttgcttat taaatgagcg acttaaaatg attgaaaata 1680 
atgctgttct ttagtagcaa ctaaaatgtg tcttgctgtc atttatattc cttttcccag 1740 
gaaagaagca tttctgatac tttctgtcaa aaatcaatat gcagaatggc atttgcaata 1800 
aaaggtttcc taaaaaaaaa aaaaaaaaaa aaaa 1834 

<210> 40 

<211> 1524 

<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7982285CB1 

<400> 40 

gggacatata aaaatgccat tgtaactact gtagagtaaa gtgttagctg cgctgccgga 60 

ggaaacggaa gaaggagcaa gctatggagg ggaacaggga tgaggctgag aaatgtgtcg 120 

agatcgcccg ggaggccctg aacgccggca accgcgagaa ggcccagcgc ttcctgcaga 180 

aggccgagaa gctctaccca ctgccctcgg cccgcgcact attggaaata attatgaaaa 240 

atggaagcac ggctggaaat agccctcatt gccgaaaacc atcaggtagt ggcgatcaaa 300 

gcaagcctaa ttgcacaaag gacagcacat ctggtagtgg tgaaggtgga aaaggctata 360 

ccaaagacca agtagatgga gttctcagca taaacaaatg taaaaattac tatgaagtac 420 

ttggagttac gaaagatgct ggtgatgaag atttgaaaaa agcttataga aagcttgctt 480 

tgaagtttca tccagacaaa aaccatgcac ctggagcaac agatgctttt aaaaagattg 540 

gaaatgctta tgctgtttta agtaatccag aaaagcgaaa acagtatgac ctcacgggca 600 

atgaagaaca agcatgtaac caccaaaaca atggcagatt taatttccat agaggttgtg 660 

aagctgatat aactccagaa gacttgttta atatattttt tgggggtgga tttccttcag 720 

gtagtgtaca ttctttttca aatggaagag ctggttatag ccaacaacat cagcatcgac 780 
atagtggaca tgaaagagaa gaggaaagag gagatggagg tttttctgtg tttatccagc 840
tgatgcccat aattgtattg atcctcgtgt cattattaag ccagttgatg gtctctaatc 900
ctccttattc cttatatccc agatctggaa ctgggcaaac tattaaaatg caaacagaaa 960
acttgggtgt tgtttattat gtcaacaagg acttcaaaaa tgaatataaa ggaatgttat 1020
tacaaaaggt agaaaagagt gtggaggaag attatgtgac taatattcga aataactgct 1080
ggaaagaaag acaacaaaaa acagatatgc agtatgcagc aaaagtatac cgtgatgatc 1140
gactccgaag gaaggcagat gccttgagca tggacaactg taaagaatta gagcggctta 1200
ccagtcttta taaaggagga tgaactggaa tttttattta taccttttag cgtactcttt 1260
attttttctg taagtaagtt tggtttcatc atgagggatg aaggaaaaga tttgatactg 1320
aaaactaaac tgaatagttg gttcctgaaa tcttggactg tttatgacct actggctcct 1380
ttaaatagta actgaaaact aaaatggaat attttagtta acgcttctac aagtattttc 1440
attttaaaag cttacatgat tcctaaacta aagtgtcatg agaaaggatt atcacacctg 1500
tagcaatttc cagttttagt gatt 1524



<210> 41 
<211> 2973 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7758505CB1 



<400> 41 

cgagccgggt ggctcgacca aggagtgggt gtgggtgggt ctttcggggc tcccggcgac 60 
ccgcggtgcc gacgcaacgc cgacgtatgg tgccaagcga actttaaaaa gctgcttcgg 120 
acaaaccaga gccaggattt ccactgtcgg ggacccggga tcggaagggt ctagcccgag 180 
ggaaatgctg gaagatccca tcggccagtg accagcaact ttccggcgag attttgacgc 240
ggagaactgt gctctgcctc ctcttattct cccaaagctc acgttggcgt cctgccttgc 300 
gggggaactc ggcgcgctct ctgcctgagc agcgagtgaa ttgaacccca gcccgctccg 360 
gcgcctccgg gctgatgagt gtcgctctcc gcccgtccat ctctttttcc cggaggtaaa 420 
ggcccgcggt cccccacctt cagtgcgccc gggttccaag cgccggagcc agcgttttgg 480 
cggagccgct tcttggatgc tgaaggctgg gctcctccat cgtgggtgcc gaggcggcga 540 
tgggtgtcct caaagtgtgg ctcgggctgg ccctagcgtt ggcggaattt gcagtattgc 600 
ctcatcattc cgaaggtgct tgtgtctatc aggattcctt gttggcggat gccacaattt 660 
ggaagcccga ttcatgccag agctgccgtt gccatggtga tattgttatc tgcaaacctg 720 
ctgtttgcag aaaccctcaa tgtgcctttg agaagggaga agtgcttcaa atagctgcca 780 
accaatgctg tcctgagtgt gttttgagga ctccaggatc ttgccatcat gaaaagaaaa 840 
tccatgagca tgggacagaa tgggcctctt ctccatgtag tgtgtgctct tgcaatcatg 900 
gggaagtccg atgtaccccc caaccatgcc caccgctgtc atgtggacac caggagctgg 960 
cattcatccc tgaaggaagc tgctgcccag tttgtgtggg ccttgggaaa ccctgttcct 1020 
atgaaggcca tgtgtttcag gatggggagg actggcggct gagccggtgt gccaaatgtc 1080 
tgtgtagaaa tggggttgcc cagtgcttca cagctcagtg tcagcctcta ttttgtaacc 1140 
aggatgagac tgtagtccga gtccctggaa aatgttgccc gcagtgctct gcaagatcct 1200 
gctctgcagc tggccaagta tacgagcatg gtgagcagtg gagcgaaaat gcctgcacca 1260 
cgtgtatatg tgaccggggt gaggtcaggt gtcacaagca ggcctgcctg cccctgagat 1320 
gcggaaaggg tcagagcagg gctcggcgtc atgggcaatg ctgtgaggaa tgtgtgtctc 1380
ctgccgggag ctgctcctat gatggagttg tgcggtacca ggacgaaatg tggaagggct 1440 
cggcctgtga gttctgcatg tgtgatcatg gccaagtgac ctgccagact ggagagtgtg 1500 
ccaaagtgga gtgtgcccgg gatgaagaat taattcactt agatggaaag tgttgtcctg 1560 
aatgcatttc aaggaatggt tattgtgttt atgaagaaac tggagaattt atgtcatcaa 1620 
atgctagtga agttaaacgt attccagagg gagagaagtg ggaagatggc ccttgcaagg 1680 
tgtgtgagtg ccgaggggct caggtaactt gctacgagcc ctcttgccca ccatgtccag 1740 
tgggcacact ggccttagag gtgaagggac agtgctgtcc agactgcaca tcagttcatt 1800 
gccatccaga ttgtttgaca tgctctcagt ctccagacca ctgtgacctc tgccaagatc 1860 
ctaccaagtt actgcagaat ggatggtgtg tgcacagctg tggactgggt ttttaccaag 1920 
ctggcagtct ctgtatagcc tgccagcccc agtgctccac gtgtaccagt gggctggagt 1980
gctcatcctg ccagcctccc ctgctgatgc ggcacgggca gtgtgtgcct acctgtgggg 2040 
acggcttcta ccaagatcgc cattcctgtg cagtctgcca tgagtcctgt gcaggttgct 2100 
ggggcccaac ggagaagcac tgcttggcct gcagagatcc cctccacgtg ctgagagatg 2160 
gcggctgtga gagcagctgt ggaaaaggct tctacaacag gcagggcacc tgtagcgctt 2220 
gtgaccaatc ctgtgacagt tgtggcccca gtagccccag gtgtcttacc tgtactgaga 2280 
agacagtgct gcatgatggg aaatgcatgt ctgaatgccc tggcgggtac tatgctgatg 2340 
ccactggcag gtgcaaagtt tgtcataact catgtgccag ctgctctggg cccacaccct 2400 
ctcactgtac agcctgcagc ccccccaagg ctctgcgtca aggccactgt ctgccccgct 2460 
gtggagaggg tttctactct gaccacggag tctgcaaagc ctgtcactcc tcctgcctgg 2520 
cttgtatggg tcccgcaccc tctcactgta ctgggtgtaa gaagccagag gaaggactgc 2580 
aagtggagca gctgtctggc gtgggcatcc cctctggcga gtgtctagcc cagtgtagag 2640 
cccattttta cttggagagc actggcctat gtgaagggca aaatctggac ttctgtcaga 2700 
atttagaagt gatttctgct gtttgccttg gcatatcatc tacagagaat tgatgacatc 2760 
ctgaataaat aatttgactc aatagccagg ccatctatga gtggttgagg agatgaaagg 2820 
gaagtattat agtttccttt ctgttcccac aagtagcctt gctgttgggt gaatagtttg 2880 
actctaaagc tacgtgaaaa aaaaatcatt agtttgtatt tttcattgta aacatatgtt 2940 
cattaaaaaa attttataat acacccacta cct 2973 

<210> 42 
<211> 2126 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 6885756CB1 

<400> 42 

gaggagtaat ttgattcagg tgttctagaa gtcatgatgt gggctgtgtc tgttgaattc 60 
ccagcgatgc aaggggacac accctgtgac tcattcctta attgagtgct gatatttgat 120 
tggtttatcg cacacctgat gggtgggtgg ggtgttcgcg gttggagggg gtgagttata 180 
taagggctga tgcggccaga gagctggtca tttgaagact ctctcggaag agatagcgtc 240 
ttgctgcaac ctgcggtccc agcagaaaaa ccttgtgatc cttgttgcgg gcgacatgga 300 
agacgactca ctctatttgg gaggtgactg gcagttcaat cacttttcaa aactcacatc 360 
ttctcggcta gatgcagctt ttgctgaaat ccagcggact tctctctctg aaaagtcacc 420
actctcatct gagacccgtt tcgacctctg tgatgatttg gctcctgtgg caagacagct 480 
tgctcccagg gagaagcttc ctctgagtag caggagacct gctgcggtgg gggctgggct 540 
ccagaagata ggaaatacct tctatgtgaa cgtttccctg cagtgcctga catacacact 600 
gccgctttcc aactacatgc tgtcccggga ggactctcaa acgtgtcatc ttcacaagtg 660 
ctgcatgttc tgtactatgc aagctcacat cacatgggcc ctctaccgtc ctggccatgt 720 
catccagccc tcacaggtat tggctgctgg cttccataga ggtgagcagg aggatgccca 780
tgaatttctc atgtttactg tggatgccat gaaaaaggca tgccttcccg ggcacaagca 840 
gctagatcat cactccaagg acaccaccct catccaccaa atatttggag cgtattggag 900 
atctcaaatc aagtatctcc actgccacgg catttcagac acctttgacc cttacctgga 960 
catcgccctg gatatccagg cagctcagag tgtcaagcaa gctttggaac agttggtgaa 1020 
gcccaaagaa ctcaatggag agaatgccta tcattgtggt ctttgtctcc agaaggcgcc 1080 
tgcctccaag acgttaactt tacccacttc tgccaaggtc ctcattcttg tattgaagag 1140 
attctccgat gtcacaggca acaaacttgc caagaatgtg caatatccta agtgccgtga 1200 
catgcagcca tacatgtctc agcagaacac aggacctctt gtctatgtcc tctatgctgt 1260 
gctggtccac gctgggtgga gttgtcacaa cggacattac ttctcttatg tcaaagctca 1320 
agaaggccag tggtataaaa tggatgatgc cgaggtcact gcctctggca tcacctctgt 1380 
cctgagtcaa caggcctatg tcctctttta catccagaag agtgaatggg aaagacacag 1440 
tgagagtgtg tcaagaggca gggaaccaag agcccttggt gctgaagaca cagacaggcc 1500 
agcaacgcaa ggagagctca agagagacca cccttgcctc caggtacccg agttggacga 1560 
gcacttggtg gaaagagcca ctcaggaaag caccttagac cactggaaat tcccccaaaa 1620 
gcaaaacaaa acgaagcctg agttcaacgt cagaaaagtt gaaggtaccc tgcctcccaa 1680
cgtacttgtg attcatcaat caaaatacaa gtgtggtatg aaaaaccatc atcctgaaca 1740
gcaaagctcc ctgctaaacc tctcttcgac gaaaccgaca gatcaggagt ccatgaacac 1800
tggcacactc gcttctctgc aagggagcac caggagatcc aaagggaata acaaacacag 1860
caagagatct ctgcttgtgt gccagtgatc acagtggaag taccgaccca cactgagggg 1920
tgcacacaca cacacacaca caaacacaaa tacacccaca agcgcgcacg gaaacacaca 1980
cacacccaca caaacacgaa caccgtcaat cctacataaa gtaatgagga gccccagttt 2040
ctgtctctac aacagggaca attggatagt gatggctgcg tctcaggatg agcccacaca 2100
tgggaaacat caagttgggg gttcag 2126



<210> 43 

<211> 1973 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7500748CB1 



<400> 43 

ggaagtagcc gcaggcatgg cggcggctat gccgcttgct ctgctcgtcc tgttgctcct 60 
ggggcccggc ggctggtgcc ttgcagaacc cccacgcgac agcctgcggg aggaacttgt 120 
catcaccccg ctgccttccg gggacgtagc cgccacattc cagttccgca cgcgctggga 180 
ttcggagctt cagcgggaag gagtgtccca ttacaggctc tttcccaaag ccctggggca 240 
gctgatctcc aagtattctc tacgggagct gcacctgtca ttcacacaag gcttttggag 300 
gacccgatac tgggggccac ccttcctgca ggccccatca ggtgcagagc tgtgggtctg 360 
gttccaagac actgtcactg atgtggataa atcttggaag gagctcagta atgtcctctc 420 
agggatcttc tgcgcctctc tcaacttcat cgactccacc aacacagtca ctcccactgc 480 
ctccttcaaa cccctgggtc tggccaatga cactgaccac tactttctgc gctatgctgt 540 
gctgccgcgg gaggtggtct gcaccgaaaa cctcaccccc tggaagaagc tcttgccctg 600 
tagttccaag gcaggcctct ctgtgctgct gaaggcagat cgcttgttcc acaccagcta 660 
ccactcccag gcagtgcata tccgccctgt ttgcagaaat gcacgctgta ctagcatctc 720 
ctgggagctg aggcagaccc tgtcagttgt atttgatgcc ttcatcacgg ggcagggaaa 780 
gaaagactgg tccctcttcc ggatgttctc ccgaaccctc acggagccct gccccctggc 840 
ttcagagagc cgagtctatg tggacatcac cacctacaac caggacaacg agacattaga 900 
ggtgcaccca cccccgacca ctacatatca ggacgtcatc ctaggcactc ggaagaccta 960 
tgccatctat gacttgcttg acaccgccat gatcaacaac tctcgaaacc tcaacatcca 1020 
gctcaagtgg aagagacccc cagagaatgg ttacatccac taccagcctg cccaggaccg 1080 
gctgcaaccc cacctcctgg agatgctgat tcagctgccg gccaactcag tcaccaaggt 1140 
ttccatccag tttgagcggg cgctgctgaa gtggaccgag tacacaccag atcctaacca 1200 
tggcttctat gtcagcccat ctgtcctcag cgcccttgtg cccagcatgg tagcagccaa 1260 
gccagtggac tgggaagaga gtcccctctt caacagcctg ttcccagtct ctgatggctc 1320 
taactacttt gtgcggctct acacggagcc gctgctggtg aacctgccga caccggactt 1380 
cagcatgccc tacaacgtga tctgcctcac gtgcactgtg gtggccgtgt gctacggctc 1440 
cttctacaat ctcctcaccc gaaccttcca catcgaggag ccccgcacag gtggcctggc 1500 
caagcggctg gccaacctta tccggcgcgc ccgaggtgtc cccccactct gattcttgcc 1560 
ctttccagca gctgcagctg ccgtttctct ctggggaggg gagcccaagg gctgtttctg 1620 
ccacttgctc tcctcagagt tggcttttga accaaagtgc cctggaccag gtcagggcct 1680 
acagctgtgt tgtccagtac aggagccacg agccaaatgt ggcatttgaa tttgaattaa 1740 
cttagaaatt catttcctca cctgtagtgg ccacctctat attgaggtgc tcaataagca 1800 
aaagtggtcg gtggctgctg tattggacag cacagaaaaa gatttccatc accacagaaa 1860 
ggtcggctgg cagcactggc caaggtgatg gggtgtgcta cacagtgtat gtcactgtgt 1920 
agtggatgga gtttactgtt tgtggaataa aaacggctgt ttccgtggaa aaa 1973 



<210> 44 
<211> 1884 
<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500749CB1

<400> 44 

ggcatggcgg cggctatgcc gcttgctctg ctcgtcctgt tgctcctggg gcccggcggc 60
tggtgccttg cagaaccccc acgcgacagc ctgcgggagg aacttgtcat caccccgctg 120
ccttccgggg acgtagccgc cacattccag ttccgcacgc gctgggattc ggagcttcag 180
cgggaaggag acactgacca ctactttctg cgctatgctg tgctgccgcg ggaggtggtc 240
tgcaccgaaa acctcacccc ctggaagaag ctcttgccct gtagttccaa ggcaggcctc 300
tctgtgctgc tgaaggcaga tcgcttgttc cacaccagct accactccca ggcagtgcat 360
atccgccctg tttgcagaaa tgcacgctgt actagcatct cctgggagct gaggcagacc 420
ctgtcagttg tatttgatgc cttcatcacg gggcagggaa agaaagactg gtccctcttc 480
cggatgttct cccgaaccct cacggagccc tgccccctgg cttcagagag ccgagtctat 540
gtggacatca ccacctacaa ccaggacaac gagacattag aggtgcaccc acccccgacc 600
actacatatc aggacgtcat cctaggcact cggaagacct atgccatcta tgacttgctt 660
gacaccgcca tgatcaacaa ctctcgaaac ctcaacatcc agctcaagtg gaagagaccc 720
ccagagaatg aggccccccc agtgcccttc ctgcatgccc agcggtacgt gagtggctat 780
gggctgcaga agggggagct gagcacactg ctgtacaaca cccacccafca ccgggccttc 840
ccggtgctgc tgctggacac cgtaccctgg tatctgcggc tgtatgtgca caccctcacc 900
atcacctcca agggcaagga gaacaaacca agttacatcc actaccagcc tgcccaggac 960
cggctgcaac cccacctcct ggagatgctg attcagctgc cggccaactc agtcaccaag 1020
gtttccatcc agtttgagcg ggcgctgctg aagtggaccg agtacacacc agatcctaac 1080
catggcttct atgtcagccc atctgtcctc agcgcccttg tgcccagcat ggtagcagcc 1140
aagccagtgg actgggaaga gagtcccctc ttcaacagcc tgttcccagt ctctgatggc 1200
tctaactact ttgtgcggct ctacacggag ccgctgctgg tgaacctgcc gacaccggac 1260
ttcagcatgc cctacaacgt gatctgcctc acgtgcactg tggtggccgt gtgctacggc 1320
tccttctaca atctcctcac ccgaaccttc cacatcgagg agccccgcac aggtggcctg 1380
gccaagcggc tggccaacct tatccggcgc gcccgaggtg tccccccact ctgattcttg 1440
ccctttccag cagctgcagc tgccgtttct ctctggggag gggagcccaa gggctgtttc 1500
tgccacttgc tctcctcaga gttggctttt gaaccaaagt gccctggacc aggtcagggc 1560
ctacagctgt gttgtccagt acaggagcca cgagccaaat gtggcatttg aatttgaatt 1620
aacttagaaa ttcatttcct cacctgtagt ggccacctct atattgaggt gctcaataag 1680
caaaagtggt cggtggctgc tgtattggac agcacagaaa aagatttcca tcaccacaga 1740
aaggtcggct ggcagcactg gccaaggtga tggggtgtgc tacacagtgt atgtcactgt 1800
gtagtggatg gagtttactg tttgtggaat aaaaacggct gtttccgtgg aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaag 1884



<210> 45 
<211> 1581 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7503401CB1 



<400> 45 

cgccaccacc tcagctgcgg accgaggcga gatggcggcc accgaggggg tcggggaggc 60
tgcgcaaggg ggcgagcccg ggcagccggc gcaacccccg ccccagccgc acccaccgcc 120
gccccagcag cagcacaagg aagagatggc ggccgaggct ggggaagccg tggcgtcccc 180
catggacgac gggtttgtga gcctggactc gccctcctat gtcctgtaca gggacagagc 240
agaatgggct gatatagatc cggtgccgca gaatgatggc cccaatcccg tggtccagat 300
catttatagt gacaaattta gagatgttta tgattacttc cgagctgtcc tgcagcgtga 360
tgaaagaagt gaacgagctt ttaagctaac ccgggatgct attgagttaa atgcagccaa 420 
ttatacagtg tggcatcata ggcgagtatt agtggaatgg ctaagagatc catctcagga 480 
gcttgaattt attgctgata ttcttaatca ggatgcaaag aattatcatg cctggcagca 540 
tcgacaatgg gttattcagg aatttaaact ttgggataat gagctgcagt atgtggacca 600 
acttctgaaa gaggatgtga gaaataactc tgtctggaac caaagatact tcgttatttc 660 
taacaccact ggctacaatg atcgtgctgt attggagaga gaagtccaat acactctgga 720 
aatgattaaa ctagtaccac ataatgaaag tgcatggaac tatttgaaag ggattttgca 780 
ggatcgtggt ctttccaaat atcctaatct gttaaatcaa ttacttgatt tacaaccaag 840 
tcatagttcc ccctacctaa ttgcctttct tgtggatatc tatgaagaca tgctagaaaa 900 
tcagtgtgac aataaggaag acattcttaa taaagcatta gagttatgtg aaatcctagc 960 
taaagaaaag gacactataa gaaaggaata ttggagatac attggaagat cccttcaaag 1020 
caaacacagc acagaaaatg actcaccaac aaatgtacag caataacacc atccagaaga 1080 
acttgatgga atgcttttat tttttattaa gggaccctgc aggagtttca cacgagagtg 1140 
gtccttccct ttgcctgtgg tgtaaaagtg catcacacag gtattgcttt ttaacaagaa 1200 
ctgatgctcc ttgggtgctg ctgctactca gactagctct aagtaatgtg attcttctaa 1260 
agcaaagtca ttggatggga ggaggaagaa aaagtcccat aaaggaactt ttgtagtctt 1320 
atcaacatat aatctaatcc cttagcatca gctcctccct cagtggtaca tgcgtcaaga 1380 
tttgtagcag taataactgc aggtcacttg tatgtaatgg atgtgaggta gccgaagttt 1440 
ggttcagtaa gcagggaata cagtcgttcc atcagagctg gtctgcacac tcacattatc 1500 
ttgctatcac tgtaaccaac taatgccaaa agaacggttt tgtaataaaa ttatagctgt 1560
atctaaaaaa aaaaaaaaag g 1581 

<210> 46 
<211> 1996 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7503485CB1 

<400> 46 

gttcagcccc cgtctacact ggggtggtgc ttagccggcg ccagaccgac cctcgacttc 60 
ggagaggcag cgcggttcct ctgggtgctt ccgcctcccc ttctcctgct tctccagcct 120 
cttcggcctc ctcgcccgcc gcgggaaccc gagaccccag tgtatgcccc acccctgacc 180 
ccgctcgcga catgtccacc ccggctcggc ggcgcctcat gcgggacttc aagaggttgc 240 
aggaggatcc tccagccgga gtcagcgggg ctccgtccga gaacaacata atggtgtgga 300 
acgcggtcat tttcgggcct gaagggaccc cgtttgagga tgtctatgca gatggtagta 360 
tatgtctgga catacttcag aaccgttgga gtccaaccta tgatgtgtct tccattctaa 420
catccataca gtctctgttg gatgaaccca atcccaatag tccagcaaac agccaggctg 480 
ctcagctgta ccaggagaac aaacgggaat atgaaaagcg tgtttctgca atagtagaac 540 
aaagctggcg tgattgttga ccccgggtac agtttaaaga agctggccat aagaaaaata 600 
tatattgatg tgtttgtcac ctccctactc ctgtcattac atttacttta ttaaaagcaa 660 
aataactgtt gtgctgtttc catcttcctt gccaagtttt cctacccctt ctaccctctc 720 
cttaaacatc agaaaacacc ctctatgaaa tcaaatgtac tgtacctggg ttacttgcaa 780 
aaattactaa tgcttcagtt tttctgttgt atttcatttc cagttttcag gcagttattt 840 
ttattattgt actttaagct tttaagatga attgttatac aagaggtgct tatgcttagc 900 
ttgatgacca ggatgttatt tttaacaaaa tgattgctga agtgtttcat cctggctggt 960 
ccttcacttg tgttggattt agaagtgaat gtgtttggaa tatggcctac agagaataga 1020 
aacaaatcca tgtaaacaat tttgaaggag gcatgggagc taaaaatcct gtgatactaa 1080 
gatctcagtc atatgaatta caacgtagta tttactggca agaaggagaa agttgaagga 1140 
ctcagctaaa ggagtacagc aattgtagta actgacacat cctctctttg caagctgctg 1200 
actgggcaca ctcatgccaa gtttcagaat tattggtctt ctgggttttt gctttttaaa 1260 
agaggtgtgg gagcagagga atggaaacaa tcgtgagttt ttgagctagg gaaagttgga 1320 
gctcctttaa tctttttaaa ggatcagtgc tgccctaagt gaataaactc aattgtccat 1380 
ctttatttta gagttttaat gaattcaagg aagggagcat agcatatctg tggcaaacta 1440
ttttccactc aaatcctgag ttattgctgc atgctttaat ttcttccctt tcagcatctg 1500 
agaaccttaa agccaatgtc tgcgatcttt ttttggatat ttatactttt agatatatag 1560 
tacctttaag tagcagtatg ggacaaggct tgtaaatgtt ttgtctaatg ttctattgtc 1620 
accttttatg catttatcac ttccaaatct aactttgcac aagtaaccca tgtaaaaaaa 1680 
aaatgtacat ttttcaaaag ttgtaaataa aaataacctt aaaatttcaa aaaaaaaaaa 1740 
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800 
aaaaaaaatt ttttcttgtg ggacaaaaaa aaaaaagcgg cggtgtgtgt gtgttttaat 1860 
agccaaacaa aatttttttt cggtatattt tggggggtgg cccccccttt tttagcaaaa 1920 
tgagaaaaac tgtacacgat gtataaaccg cggaggagga ataaaaatat ttttatcaaa 1980 
aagaaaaaaa ttatca 1996 

<210> 47 
<211> 1232 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7504076CB1 

<400> 47 

ggcaagcgca ggtcaggcgg tgcagctggg cggccagcgg atcgtgccgc ggcggccgag 60 
cgcagctaca ggagggtgtc cagaagccac aagccatggc tgtggggaac atcaacgagc 120 
tgcccgagaa catcctgctg gagctgttca cgcacgtgcc cgcccgccag ctgctgctga 180 
actgccgcct ggtctgcagc ctctggcggg acctcatcga cctcgtgacc ctctggaaac 240 
gcaagtgcct gcgagagggc ttcatcactg aggactggga ccagcccgtg gccgactgga 300 
agatcttcta cttcttacgg agcctgcaca ggaacctcct gcacaacccg tgcgctgaag 360 
aggggttcga gttctggagc ctggatgtga atggaggcga tgagtggaag gtggaggatc 420 
tctctcgaga ccagaggaag gaattcccca atgaccaggt caagaaatac ttcgttactt 480 
catattacac ctgcctcaag tcccaggtgg tggacctcaa ggccgaaggg tattgggagg 540 
agctgatgga taccacacgg ccggacatcg aggtcaagga ctggttcgca gccaggccag 600 
attgcgggtc caagtaccag ctgtgcgttc agctcctgtc gtccgcgcac gcgcctctgg 660 
ggaccttcca gccagacccg gcgaccatcc agcagaagag cgatgccaag tggagggagg 720 
tctcccacac attctccaac tacccgcccg gcgtccgcta catctggttt cagcacggcg 780 
gcgtggacac tcattactgg gccggctggt acggcccgag ggtcaccaac agcagcatca 840 
ccatcgggcc cccgctgccc tgacaccccc tgagccccca tctgctgaac cctgactgct 900 
ttacggacat tggatgaagc cgaagcattt agaatggtgc ctggcacaca gttggtgcgt 960 
gatatggtta agctttgtgt ccccacccac atctcatctt gaatgtgacg gtttccccgg 1020 
ctccctcctg ccgccatgtg aagaaggtcg ttgcttcccc ttcaccttcc accaccatga 1080 
tttagagatg gagtttcacc atgattggac acagggtggt ctcaatctcc tgaacctcgt 1140 
gatccaccca cctcgacctc ccatagtgct gagattaaca tggcgtggcc accgcgctct 1200 
accgcttgtg tcttgacgcg tccccagcct ca 1232 

<210> 48 

<211> 810 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500926CB1 

<400> 48 

ggggtccgga actgcttgtt ccggcagtgg aagagacgcg ccggcgttgg ccgctgctgc 60 
tagcagcttg aaccccaggg tcgggaccga tgtcggcttg ggctgctgcc agcctaagca 120 
gggccgctgc ccgatgcttg ctggcacgag gccccggggt cagggcggct cctccgcgcg 180
acccccggcc ctcccacccc gagccccggg gctgcggtgc cgctccgggc aggacgctgc 240
actttaccgc ggctgtcccc gccgggcaca acaagtggtc caaagtcagg cacatcaagg 300
gtccgaagga cgtcgaaagg agtcgcatct tctccaaact ctgtttgaac atccgcctgg 360
cagtgaaaga aggaggcccc aaccctgagc acaacagcaa cctggccaat atcttagagg 420
tgtgtcgcag caaacatatg cccaagtcaa cgattgagac agcactgaaa atggagaaat 480
ccaaggacac ttatttgctg tatgagggtc gaggccctgg tggctcttct ctgctcatcg 540
aggcattatc taacagtagc cacaagtgcc aagcagactt gcgaccttga agccaaagga 600
atctcacttg tggggcctcc ttgtcagctc tgctgctgtc tcagagccat ctggatgagt 660
gtcccgacac cctctcggat gcagggcagg accacccagc tggtcagact ctgatgttgg 720
gtagctggcc tctgtgggga ttgtaagtgc cctgaggcgc tctgtactag aaactgctct 780
taatagtaac ggtgattatt ggttgctgca 810



<210> 49 

<211> 2625 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7503216CB1 



<400> 49 

gcggcggcgg cagcagcagc agcagcatta gcagcagcag cttctctcaa tgctgagtag 60 
tatatactca agataaatga aaacacatgt ccacacaaaa gcttgctcat gaattgtcat 120 
agaaacattc ttcacaatat ccaaaagtgg aagcaaccca aatgtccatc aactgatgaa 180 
tggataaaca aagtaatatt ctgtacgatg gaattttaat tggcatattt ggtaataaaa 240 
aggaatcaag tactgataca tcctacaaca tggatacatc ttgaaaatgc tatgctaaat 300 
gaaagagggt gaatcaccag gattttcaac tgagaaattt aagaataatt gaacctaacg 360 
aggtgacaca ctcaggagac acaggtgtgg aaacagacgg cagaatgcct ccaaaggtga 420 
cttcagagct gcttcggcag ctgagacaag ccatgaggaa ctctgagtat gtgaccgaac 480 
cgatccaggc ctacatcatc ccatcgggag atgctcatca gagtgagtat attgctccat 540 
gtgactgtcg gcgggctttt gtctctggat tcgatggctc tgcgggcaca gccatcatca 600 
cagaagagca tgcagccatg tggactgacg ggcgctactt tctccaggct gccaagcaaa 660 
tggacagcaa ctggacactt atgaagatgg gtctgaagga cacaccaact caggaagact 720 
ggctggtgag tgtgcttcct gaaggatcca gggttggtgt ggaccccttg atcattccta 780 
cagattattg gaagaaaatg gccaaagttc tgagaagtgc cggccatcac ctcattcctg 840 
tcaaggagaa cctcgttgac aaaatctgga cagaccgtcc tgagcgccct tgcaagcctc 900 
tcctcacact gggcctggat tacacagggc tatttaatct ccgaggatca gatgtggagc 960 
acaatccagt atttttctcc tacgcaatca taggactaga gacgatcatg ctcttcattg 1020 
atggtgaccg catagacgcc cccagtgtga aggagcacct gcttcttgac ttgggtctgg 1080 
aagccgaata caggatccag gtgcatccct acaagtccat cctgagcgag ctcaaggccc 1140 
tgtgtgctga cctctcccca agggagaagg tgtgggtcag tgacaaggcc agctatgctg 1200 
tgagcgagac catccccaag gaccaccgct gctgtatgcc ttacaccccc atctgcatcg 1260 
ccaaagctgt gaagaattca gctgagtcag aaggcatgag gcgggctcac attaaagatg 1320 
ctgttgctct ctgtgaactc tttaactggc tggagaaaga ggttcccaaa ggtggtgtga 1380 
cagagatctc agctgctgac aaagctgagg agtttcgcag gcaacaggca gactttgtgg 1440 
acctgagctt cccaacaatt tccagtacgg gacccaacgg cgccatcatt cactacgcgc 1500 
cagtccctga gacgaatagg accttgtccc tggatgaggt gtaccttatt gactcgggtg 1560 
ctcaatacaa ggatggcacc acagatgtga cgcggacaat gcattttggg acccctacag 1620 
cctacgagaa ggaatgcttc acatatgtcc tcaagggcca catagctgtg agtgcagccg 1680 
ttttcccgac tggaaccaaa ggtcaccttc ttgactcctt tgcccgttca gctttatggg 1740 
attcaggcct agattacttg cacgggactg gacatggtgt tgggtctttt ttgaatgtcc 1800 
atgagggtcc ttgcggcatc agttacaaaa cattctctga tgagcccttg gaggcaggca 1860 
tgattgtcac tgatgagccc gggtactatg aagatggggc ttttggaatt cgcattgaga 1920 
atgttgtcct tgtggttcct gtgaagacca agtataattt taataaccgg ggaagcctga 1980 
cctttgaacc tctaacattg gttccaattc agaccaaaat gatagatgtg gattctctta 2040
cagacaaaga gtgcgactgg ctcaacaatt accacctgac ctgcagggat gtgattggga 2100
aggaattgca gaaacagggc cgccaggaag ctctcgagtg gctcatcaga gagacgcaac 2160
ccatctccaa acagcattaa taaatacctc cccggttttg tttttgtaaa atgctctgga 2220
ggaaggaaga aacgtggcag atccctgaca tctttcccct ttcctttcct tcttccctac 2280
ctcccctttt tactttagac tttaagaaga acagaaaatc ttcttatcct ctttgatatt 2340
ttattgcaaa cactcagtct tttatgattt tttaattgtt gagaacaagc caagaataaa 2400
attgctgcac cagaaggagg gtccctccaa agttgaacac ttggtgaaag gaagatgccc 2460
cgacttcttt ggccagtgat ggggaatcag tgagtgctcc atgatggtca tgttccaggt 2520
gctagtacat cattcatgat caccttaatg ctcatgagac tatatttatg atcagtgaat 2580
aaaaatgtca gaactgtgaa aaaaaaaaaa aaaaaaaaaa aaaag 2625



<210> 50 

<211> 2432 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7503233CB1 



<400> 50 

gcgctcttcc tggttgggcc ctgccctgag ctgccaccgg gaagccagcc tcagggactg 60 
cagcgacccc caaacacccc tcccccagga tgtcggagga gatcatcacg ccggtgtact 120 
gcactggggt gtcagcccaa gtgcagaagc agcgggccag ggagctgggc ctgggccgcc 180 
atgagaatgc catcaagtac ctgggccagg attatgagca gctgcgggtg cgatgcctgc 240 
agagtgggac cctcttccgt gatgaggcct tccccccggt accccagagc ctgggttaca 300 
aggacctggg tcccaattcc tccaagacct atggctatgc cggcatcttc catttccagc 360 
tgtggcaatt tggggagtgg gtggacgtgg tcgtggatga cctgctgccc atcaaggacg 420 
ggaagctagt gttcgtgcac tctgccgaag gcaacgagtt ctggagcgcc ctgcttgaga 480 
aggcctatgc caaggtaaat ggcagctacg aggccctgtc agggggcagc acctcagagg 540 
gctttgagga cttcacaggc ggggttaccg agtggtacga gttgcgcaag gctcccagtg 600 
acctctacca gatcatcctc aaggcgctgg agcggggctc cctgctgggc tgctccatag 660 
acatctccag cgttctagac atggaggcca tcactttcaa gaagttggtg aagggccatg 720 
cctactctgt gaccggggcc aagcaggtga actaccgagg ccaggtggtg agcctgatcc 780 
ggatgcggaa cccctggggc gaggtggagt ggacgggagc ctggagcgac agctcctcag 840 
agtggaacaa cgtggaccca tatgaacggg accagctccg ggtcaagatg gaggacgggg 900 
agttctggat gtcattccga gacttcatgc gggagttcac ccgcctggag atctgcaacc 960 
tcacacccga cgccctcaag agccggacca tccgcaaatg gaacaccaca ctctacgaag 1020 
gcacctggcg gcgggggagc accgcggggg gctgccgaaa ctacccagcc accttctggg 1080 
tgaaccctca gttcaagatc cggctggatg agacggatga cccggacgac tacggggacc 1140 
gcgagtcagg ctgcagcttc gtgctcgccc ttatgcagaa gcaccgtcgc cgcgagcgcc 1200
gcttcggccg cgacatggag actattggct tcgcggtcta cgaggtccct ccggagctgg 1260 
tgggccagcc ggccgtacac ttgaagcgtg acttcttcct ggccaatgcg tctcgggcgc 1320 
gctcagagca gttcatcaac ctgcgagagg tcagcacccg cttccgcctg ccacccgggg 1380 
agtatgtggt ggtgccctcc accttcgagc ccaacaagga gggcgacttc gtgctgcgct 1440 
tcttctcaga gaagagtgct gggactgtgg agctggatga ccagatccag gccaatctcc 1500 
ccgatgagca agtgctctca gaagaggaga ttgacgagaa cttcaaggcc ctcttcaggc 1560 
agctggcagg ggaggacatg gagatcagcg tgaaggagtt gcggacaatc ctcaatagga 1620 
tcatcagcaa acacaaagac ctgcggacca agggcttcag cctagagtcg tgccgcagca 1680 
tggtgaacct catggatcgt gatggcaatg ggaagctggg cctggtggag ttcaacatcc 1740 
tgtggaaccg catccggaat tacctgtcca tcttccggaa gtttgacctg gacaagtcgg 1800 
gcagcatgag tgcctacgag atgcggatgg ccattgagtc ggcaggcttc aagctcaaca 1860 
agaagctgta cgagctcatc atcacccgct actcggagcc cgacctggcg gtcgactttg 1920 
acaatttcgt ttgctgcctg gtgcggctag agaccatgtt ccgatttttc aaaactctgg 1980 
acacagatct ggatggagtt gtgacctttg acttgtttaa gtggttgcag ctgaccatgt 2040 
ttgcatgagg cagggactcg gtcccccttg ccgtgctccc ctccctcctc gtctgccaag 2100
cctcgcctcc taccacacca caccaggcca ccccagctgc aagtgccttc cttggagcag 2160 
agaggcagcc tcgtcctcct gtcccctctc ctcccagcca ccatcgttca tctgctccgg 2220 
gcagaactgt gtggcccctg cctgtgccag ccatgggctc gggatggact ccctgggccc 2280 
cacccattgc caagccagga aggcagcttt cgcttgttcc tgcctcggga cagccccggg 2340 
tttccccagc atcctgatgt gtcccctctc cccacttcag aggccaccca ctcagcacca 2400 
acgggcttgg ccttgcttgc agactataaa ct 2432 

<210> 51 
<211> 3969 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7726576CB1 



<400> 51 

ccgggcaggg tcgcctctag gtgctcacct ccgccacttc gccatggcgg gtcctggccc 60 

gggcgcggtg ctggagtccc cccggcagct gctgggccgc gtgcgcttct tggcagaggc 120 

agcgcggagc ctccgcgccg ggcggccgct gccagcagcg ctggctttcg tgccgcgaga 180 

ggtgctctac aagctttaca aggacccagc gggaccgtcg cgcgtgcttc tgccggtgtg 240 

ggaggcagag ggcctggggc tgcgtgtggg cgccgcaggc ccagcccccg gtaccggctc 300 

cgggcccctc cgcgccgccc gcgacagcat tgagctccgg cgcggcgcct gcgtgcgcac 360 

cacgggcgag gagctgtgca atggccacgg gctctgggtg aagctgacaa aggagcagct 420 

ggcagagcac ctgggcgact gcgggctgca ggaaggctgg ctgctggtgt gccgcccggc 480 

ggagggcgga gcccgcctgg tacccatcga cactcccaac cacctccagc ggcagcagca 540 

gctctttggc gtggattatc ggccggtgct caggtgggaa caggtggtgg acctgacata 600 

ctcacatcgc ctgggatcga gacctcagcc ggcagaggca tacgcagaag ctgtacaaag 660 

gctactctat gtacccccga catggaccta cgagtgcgac gaggacctga tccacttctt 720 

gtatgaccac ctgggcaagg aggatgagaa cctgggtagc gtgaagcagt atgtggagag 780 

catagacgtt tcctcctaca cggaggagtt caacgtgtcc tgcctgacag acagcaatgc 840 

cgatacctac tgggagagcg atgggtccca gtgccaacac tgggtacggc ttactatgaa 900 

gaagggcacc attgtcaaga agctgctact cacagtggat accacagatg acaactttat 960 

gccaaagcgg gtggtggtct atgggggtga aggggacaac ctgaagaagc tgagtgacgt 1020 

gagcattgac gagaccctca tcggggatgt ctgtgtcctg gaggacatga ccgtccacct 1080 

cccgatcatc gagatccgca tcgtggagtg ccgagatgat gggattgatg ttcgtctccg 1140 

aggggtcaag atcaagtcat ctagacagcg ggaactaggg ttgaatgcag acctgttcca 1200 

gccaactagt ctggtgcgat atccacgcct agaaggcacc gaccctgaag tactgtaccg 1260 

cagagctgtc ctcctgcaga gactcatcaa gatcctcgat agtgtcctgc accacctggt 1320 

acctgcctgg gaccacacac tgggcacctt cagtgagatt aagcaagtga agcagttcct 1380 

actgctgtcc cgccagcggc caggcctggt ggctcagtgc ctgcgtgact ctgagagcag 1440 

caagcccagc ttcatgccac gcctatacat caaccgccgt cttgccatgg aacaccgtgc 1500 

ctgcccctct cgagaccctg cctgcaagaa tgcagtcttc acccaggtat atgaaggcct 1560 

caagccctct gacaaatatg aaaagcccct ggactacagg tggcccatgc gctatgacca 1620 

gtggtgggag tgtaaattta ttgcagaagg catcattgac caagggggtg gtttccggga 1680 

cagcctggca gatatgtcag aagagctgtg ccctagctca gcggataccc ccgtgcccct 1740 

gcccttcttt gtacgcacag ccaaccaggg caatggcact ggtgaggctc gggacatgta 1800 

tgtacccaac ccctcctgcc gagactttgc caagtatgaa tggatcggac agctgatggg 1860 

ggctgccctt cggggtaagg agttcctggt cctggccctg cctggttttg tgtggaagca 1920 

gctttctggt gaggaggtga gctggagcaa ggacttccca gctgtggact ctgtgctggt 1980

gaagctcctg gaagtgatgg aaggaatgga caaggagacg tttgagttca agtttgggaa 2040 

ggaactaaca ttcaccactg tactgagtga ccaacaggtg gtggagctga tccctggggg 2100 

tgcaggcatc gtcgtgggat atggggaccg ttctcgtttc atccaactgg tccagaaggc 2160 

acggctagag gagagcaagg agcaggtggc agctatgcag gcaggtctgc tgaaggtggt 2220 

accacaggct gtgctggact tgctgacctg gcaagagttg gagaagaaag tgtgtgggga 2280
tccagaggtc actgtggatg ctctgcgcaa gctcacccgg tttgaggact tcgagccatc 2340
tgactcgcgg gtgcagtatt tctgggaggc actgaacaac ttcaccaacg aggaccggag 2400 
ccgcgtcctg cgctttgtca cgggccgcag tcgcctgcca gcacggatct acatctaccc 2460 
agacaagctg ggctacgaga ccacagacgc gctgcccgag tcttccactt gctccagcac 2520 
cctcttcctg ccacactatg ccagtgccaa ggtatgcgag gagaagctcc gctatgcggc 2580 
ctacaactgc gtggccatcg acactgacat gagcccttgg gaggagtgag gcgtgccgcc 2640 
ggctgtggga ccagcaagac tgcacgtgtc cctcttggcc ttgcccaggg cgaagacacc 2700 
ttccctgccc tggtttggct gacgtgctca gcaaaacccc atgtgccctg ctcctgtgtg 2760 
cagttggggt aggggcagct ggcatggtca ggtaacacta gtggcccagc cccgcagacc 2820 
cacaagccct acccgtgctg gggcttgctt cccgaggtat ttcacctctt aagagggaat 2880 
cttccacaag cccagcacaa gctgccaggc ctgagctact tgaagggggc catctaggtc 2940 
cccaacccat ggactttgcc tccattttca gctccgcctt ttttctccta ttttctctct 3000 
ggctttcttc agccatgact cacaactaaa aacataaaac actggaggtt agtggaggcc 3060 
cctccccaag cagggagcct gggatgggca gggagtgata gccaaactcc ttggtcacct 3120 
gctccaagaa ggaagcagta gctgagcacc tgccctcaca tactgctctt ttcccctctc 3180 
cctccacacc agagatgtgg tgagctctgt tcttctacca acccagtctc aacacacaaa 3240 
gtgccaccac cttccctgac tcagaaccca catccactca atgtgaactc tactaccacg 3300 
acctccccat attcctcact tctccatcac ctccagcctg actccctgtc tgccctttca 3360 
cccccaagat tttgcacagg ttaaggccag ttatggcctt tttgaaatct gtaatagctc 3420 
ccctttcccc aactctaaag cctagacctt aaacctgttc ctagaactct ggcccccacc 3480 
attcctcagt gccacctttc tgctgctgaa aggccacagt gatgcccccc agtgtgaggc 3540 
gggaggtgtg ccctcttccc cagccaagcc tttttaccca ctccccaggt ggcagctatg 3600 
caggcaggtc tgctgaaggt ggtaccacag gctgtgctgg acttgctgac ctggcaagag 3660 
ttggagaaga aagtgtgtgg ggatccagag gtcactgtgg atgctctgcg caagctcacc 3720 
cggtttgagg acttcgagcc atctgactcg cgggtgcagt atttctggga ggcactgaac 3780 
aacttcacca acgaggaccg gagccgcttc ctgcgctttg tcacgggccg cagtcgcctg 3840 
ccagcacgga tctacatcta cccagacaag ctgggctacg agaccacaga cgcgctgccc 3900 
gagtcttcca cttgctccag caccctctct tgccagcaca ctgcgccgta taagtgagcg 3960 
agctcgtcc 3969 

<210> 52 

<211> 2537 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7503507CB1 

<400> 52 

gggttcgggg gcgccgcgct gtgaggccgg ggcctagagc cagccgcggc cgcgcaggag 60 

gggcccaggg cccgcgctcg cccgcgtccc cgccttcctc ccgcgctcag ccccgcctcg 120 

gctcgctgcc cttggctctc gtcgccatgg cctccgtcgc ccaggagagc gcgggctcgc 180 

agcgccggct accgccgcgt cacggggcgc tgcgcgggct gctactgctc tgcctgtggc 240 

tgccaagcgg ccgtgcggcc ttgccgcccg cggcgccgct gtccgaactg cacgcgcagc 300 

tgtcgggcgt ggagcagctg ctggaggagt tccgccggca actgcagcag gagcggcctc 360 

aggaggagct ggagctggag ctgcgcgcgg gcggcggccc ccaggaggac tgcccgggcc 420 

ggggcagcgg cggctacagc gcaatgcctg acgccatcat ccgcaccaag gactccctgg 480 

cggcgggtgc cagcttcctg cgggcgccgg cggccgtgcg gggctggcgg caatgcgtgg 540 

cggcctgctg ctccgagccg cgctgctccg tggccgtggt ggagctgccc cggcgccccg 600 

cgcccccggc agccgtgctc ggctgctacc tcttcaactg cacggcgcgc ggccgcaacg 660 

tctgcaagtt cgcgctgcac agcggctaca gcagctacag cctcagccgc gcgccggacg 720

gcgccgccct ggccaccgcg cgcgcctcgc cccggcagga aaaggatgcg cctccactta 780 

gcaaggctgg gcaggatgtg gttctgcatc tgcccacaga cggggtggtt ctagacggcc 840 

gcgagagcac agatgaccac gccatcgtcc agtatgagtg ggcactgctg cagggggacc 900 

cgtcagtgga catgaaggtg cctcaatcag gaaccctgaa gctgtcccac ctacaggagg 960
gaacctacac cttccagctg accgtgacgg acactgccgg gcagagaagc tctgacaacg 1020
tgtcagtgac agtgcttcgc gcagcctact ccacaggagg atgtttgcac acttgctcac 1080 
gctaccactt cttctgtgac gatggctgct gcattgacat cacgctcgcc tgcgatggag 1140 
tgcagcagtg tcctgatggg tctgatgaag acttctgcca gaatctgggc ctggaccgca 1200 
agatggtaac ccacacggca gctagtcctg ccctgccaag aaccacaggg ccgagtgaag 1260 
atgcaggggg tgactccttg gtggaaaagt ctcagaaagc cactgcccca aacaagccac 1320 
ctgcattatc aaacacagag aagaggaaag ttatatattt gagtcaaagg gtgatggagg 1380 
aggaggggaa cacccagccc cagaaacagg tgcagtgcta cccctggcgc tgggtttggc 1440 
tatcactgct ctgctgcttc tcatggttgc atgccgacta cgactggtga aacagaaact 1500 
gaaaaaagct cgtcccatta catctgagga atcggactac ctcataaatg ggatgtatct 1560 
atagtaatgt aatttcaata ccttggggca gggacatgtt ttgtttataa tttatacatc 1620 
tattaagttc tggatattta cagcttcttt tgtttttaat tgggccagaa gattctgcaa 1680 
atcccaaatc tttctttatt atttattgta aaaaaagttt ccttagaagt cataaaatat 1740 
tttgaaattt agagaggaat tcatgattaa agattcctaa aaatataatt ctgatttatg 1800 
taagctgtcc ctgaaaatag aaatgtgtac ttagctgaga gaaaattcag catctcagga 1860 
ggtggtatta ggatgactgt gttaacccat taccttttag aagccaactg ttggcccctt 1920 
accatgctgg actgctatag gcccagcttc cccttgttct gtggcccttt tcttcctcct 1980 
tgaagctccc agtattcttt ttcttttccc ctctaaacct gtttctgaga gtggatctca 2040 
agcaagttca tgccttcaat cagatgttac ttagggtggg tatacctaaa ttataaacct 2100 
tatgtacaag tcagtaagcc ttagggaagg tgagtgtggg tccttcctaa tccctctgac 2160 
gtcatgtcat ataggtggct gcctccttag actgaccttt gggagaaaaa aaccccagac 2220 
tttgaattag taacagctct aagatggtca tgcagtgaga taggaaatca agatggaagc 2280 
agagaatctg gcatgccaaa aactaacaga aacttagttg aaggcaaaga gagcaaggag 2340 
aaagtttaat acttcattac atcaaatcaa cactgctcca tggtgagagc acagcaactc 2400 
atttatatat atatatatag gctttgttga tgaaaaacaa caattgaaga gaggacgttg 2460 
agtggattcc tgggtacagc ttttgtaaaa atgtcaccat ggctttcatc caatggaatg 2520 
agtcgatgtt ttttaat 2537 

<210> 53 

<211> 2526 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7503506CB1 

<400> 53 

gggttcgggg gcgccgcgct gtgaggccgg ggcctagagc cagccgcggc cgcgcaggag 60 
gggcccaggg cccgcgctcg cccgcgtccc cgccttcctc ccgcgctcag ccccgcctcg 120
gctcgctgcc cttggctctc gtcgccatgg cctccgtcgc ccaggagagc gcgggctcgc 180 
agcgccggct accgccgcgt cacggggcgc tgcgcgggct gctactgctc tgcctgtggc 240 
tgccaagcgg ccgtgcggcc ttgccgcccg cggcgccgct gtccgaactg cacgcgcagc 300 
tgtcgggcgt ggagcagctg ctggaggagt tccgccggca actgcagcag gagcggcctc 360 
aggaggagct ggagctggag ctgcgcgcgg gcggcggccc ccaggaggac tgcccgggcc 420 
ggggcagcgg cggctacagc gcaatgcctg acgccatcat ccgcaccaag gactccctgg 480 
cggcgggtgc cagcttcctg cgggcgccgg cggccgtgcg gggctggcgg caatgcgtgg 540 
cggcctgctg ctccgagccg cgctgctccg tggccgtggt ggagctgccc cggcgccccg 600 
cgcccccggc agccgtgctc ggctgctacc tcttcaactg cacggcgcgc ggccgcaacg 660 
tctgcaagtt cgcgctgcac agcggctaca gcagctacag cctcagccgc gcgccggacg 720
gcgccgccct ggccaccgcg cgcgcctcgc cccggcagga aaaggatgcg cctccactta 780 
gcaaggctgg gcaggatgtg gttctgcatc tgcccacaga cggggtggtt ctagacggcc 840 
gcgagagcac agatgaccac gccatcgtcc agtatgagtg ggcactgctg cagggggacc 900 
cgtcagtgga catgaaggtg cctcaatcag gaaccctgaa gctgtcccac ctacaggagg 960 
gaacctacac cttccagctg accgtgacgg acactgccgg gcagagaagc tctgacaacg 1020 
tgtcagtgac agtgcttcgc gcagcctact ccacaggagg atgtttgcac acttgctcac 1080
gctaccactt cttctgtgac gatggctgct gcattgacat cacgctcgcc tgcgatggag 1140
tgcagcagtg tcctgatggg tctgatgaag acttctgcca gaatctgggc ctggaccgca 1200 
agatggtaac ccacacggca gctagtcctg ccctgccaag aaccacaggg ccgagtgaag 1260 
atgcaggggg tgactccttg gtggaaaagt ctcagaaagc cactgcccca aacaagccac 1320 
ctgcattatc aaacacagag aagaggaatc attccgcctt ttggggacca gagagtcaaa 1380 
tcattcctgt gatgccaggt gcagtgctac ccctggcgct gggtttggct atcactgctc 1440 
tgctgcttct catggttgca tgccgactac gactggtgaa acagaaactg aaaaaagctc 1500 
gtcccattac atctgaggaa tcggactacc tcataaatgg gatgtatcta tagtaatgta 1560 
atttcaatac cttggggcag ggacatgttt tgtttataat ttatacatct attaagttct 1620 
ggatatttac agcttctttt gtttttaatt gggccagaag attctgcaaa tcccaaatct 1680 
ttctttatta tttattgtaa aaaaagtttc cttagaagtc ataaaatatt ttgaaattta 1740 
gagaggaatt catgattaaa gattcctaaa aatataattc tgatttatgt aagctgtccc 1800 
tgaaaataga aatgtgtact tagctgagag aaaattcagc atctcaggag gtggtattag 1860 
gatgactgtg ttaacccatt accttttaga agccaactgt tggcccctta ccatgctgga 1920 
ctgctatagg cccagcttcc ccttgttctg tggccctttt cttcctcctt gaagctccca 1980 
gtattctttt tcttttcccc tctaaacctg tttctgagag tggatctcaa gcaagttcat 2040 
gccttcaatc agatgttact tagggtgggt atacctaaat tataaacctt atgtacaagt 2100 
cagtaagcct tagggaaggt gagtgtgggt ccttcctaat ccctctgacg tcatgtcata 2160 
taggtggctg cctccttaga ctgacctttg ggagaaaaaa accccagact ttgaattagt 2220 
aacagctcta agatggtcat gcagtgagat aggaaatcaa gatggaagca gagaatctgg 2280 
catgccaaaa actaacagaa acttagttga aggcaaagag agcaaggaga aagtttaata 2340 
cttcattaca tcaaatcaac actgctccat ggtgagagca cagcaactca tttatatata 2400 
tatatatagg ctttgttgat gaaaaacaac aattgaagag aggacgttga gtggattcct 2460 
gggtacagct tttgtaaaaa tgtcaccatg gctttcatcc aatggaatga gtcgatgttt 2520 
tttaat 2526 

<210> 54 
<211> 2464 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 7503509CB1

<400> 54 

gggttcgggg gcgccgcgct gtgaggccgg ggcctagagc cagccgcggc cgcgcaggag 60 
gggcccaggg cccgcgctcg cccgcgtccc cgccttcctc ccgcgctcag ccccgcctcg 120 
gctcgctgcc cttggctctc gtcgccatgg cctccgtcgc ccaggagagc gcgggctcgc 180 
agcgccggct accgccgcgt cacggggcgc tgcgcgggct gctactgctc tgcctgtggc 240
tgccaagcgg ccgtgcggcc ttgccgcccg cggcgccgct gtccgaactg cacgcgcagc 300 
tgtcgggcgt ggagcagctg ctggaggagt tccgccggca actgcagcag gagcggcctc 360 
aggaggagct ggagctggag ctgcgcgcgg gcggcggccc ccaggaggac tgcccgggcc 420 
cgggcagcgg cggctacagc gcaatgcctg acgccatcat ccgcaccaag gactccctgg 480 
cggcgggtgc cagcttcctg cgggcgccgg cggccgtgcg gggctggcgg caatgcgtgg 540
cggcctgctg ctccgagccg cgctgctccg tggccgtggt ggagctgccc cggcgccccg 600 
cgcccccggc agccgtgctc ggctgctacc tcttcaactg cacggcgcgc ggccgcaacg 660 
tctgcaagtt cgcgctgcac agcggctaca gcagctacag cctcagccgc gcgccggacg 720 
gcgccgccct ggccaccgcg cgcgcctcgc cccggcaggg tgcctcaatc aggaaccctg 780 
aagctgtccc acctacagga gggaacctac accttccagc tgaccgtgac ggacactgcc 840 
gggcagagaa gctctgacaa cgtgtcagtg acagtgcttc gcgcagccta ctccacagga 900 
ggatgtttgc acacttgctc acgctaccac ttcttctgtg acgatggctg ctgcattgac 960 
atcacgctcg cctgcgatgg agtgcagcag tgtcctgatg ggtctgatga agacttctgc 1020 
cagaatctgg gcctggaccg caagatggta acccacacgg cagctagtcc tgccctgcca 1080 
agaaccacag ggccgagtga agatgcaggg ggtgactcct tggtggaaaa gtctcagaaa 1140 
gccactgccc caaacaagcc acctgcatta tcaaacacag agaagaggaa tcattccgcc 1200
ttttggggac cagagagtca aatcattcct gtgatgccag atagtagttc ctcagggaag 1260
aacagaaaag aggaaagtta tatatttgag tcaaagggtg atggaggagg aggggaacac 1320 
ccagccccag aaacaggtgc agtgctaccc ctggcgctgg gtttggctat cactgctctg 1380 
ctgcttctca tggttgcatg ccgactacga ctggtgaaac agaaactgaa aaaagctcgt 1440 
cccattacat ctgaggaatc ggactacctc ataaatggga tgtatctata gtaatgtaat 1500 
ttcaatacct tggggcaggg acatgttttg tttataattt atacatctat taagttctgg 1560 
atatttacag cttcttttgt ttttaattgg gccagaagat tctgcaaatc ccaaatcttt 1620 
ctttattatt tattgtaaaa aaagtttcct tagaagtcat aaaatatttt gaaatttaga 1680 
gaggaattca tgattaaaga ttcctaaaaa tataattctg atttatgtaa gctgtccctg 1740 
aaaatagaaa tgtgtactta gctgagagaa aattcagcat ctcaggaggt ggtattagga 1800 
tgactgtgtt aacccattac cttttagaag ccaactgttg gccccttacc atgctggact 1860 
gctataggcc cagcttcccc ttgttctgtg gcccttttct tcctccttga agctcccagt 1920 
attctttttc ttttcccctc taaacctgtt tctgagagtg gatctcaagc aagttcatgc 1980 
cttcaatcag atgttactta gggtgggtat acctaaatta taaaccttat gtacaagtca 2040 
gtaagcctta gggaaggtga gtgtgggtcc ttcctaatcc ctctgacgtc atgtcatata 2100 
ggtggctgcc tccttagact gacctttggg agaaaaaaac cccagacttt gaattagtaa 2160 
cagctctaag atggtcatgc agtgagatag gaaatcaaga tggaagcaga gaatctggca 2220 
tgccaaaaac taacagaaac ttagttgaag gcaaagagag caaggagaaa gtttaatact 2280 
tcattacatc aaatcaacac tgctccatgg tgagagcaca gcaactcatt tatatatata 2340 
tatataggct ttgttgatga aaaacaacaa ttgaagagag gacgttgagt ggattcctgg 2400 
gtacagcttt tgtaaaaatg tcaccatggc tttcatccaa tggaatgagt cgatgttttt 2460 
taat 2464 



<210> 55 
<211> 1452 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7505800CB1 



<400> 55 

ccgaggcgag atggcggcca ccgagggggt cggggaggct gcgcaagggg gcgagcccgg 60
gcagccggcg caacccccgc cccagccgca cccaccgccg ccccagcagc agcacaagga 120
agagatggcg gccgaggctg gggaagccgt ggcgtccccc atggacgacg ggtttgtgag 180
cctggactcg ccctcctatg tcctgtacag gcatttccgg agagttcttt tgaagtcact 240
tcagaaggat ctacatgagg aaatgaacta catcactgca ataattgagg agcagcccaa 300
aaactatcaa gtttggcatc ataggcgagt attagtggaa tggctaagag atccatctca 360
ggagcttgaa tttattgctg atattcttaa tcaggatgca aagaattatc atgcctggca 420
gcatcgacaa tgggttattc aggaatttaa actttgggat aatgagctgc agtatgtgga 480
ccaacttctg aaagaggatg tgagaaataa ctctgtctgg aaccaaagat acttcgttat 540
ttctaacacc actggctaca atgatcgtgc tgtattggag agagaagtcc aatacactct 600
ggaaatgatt aaactagtac cacataatga aagtgcatgg aactatttga aagggatttt 660
gcaggatcgt ggtctttcca aatatcctaa tctgttaaat caattacttg atttacaacc 720
aagtcatagt tccccctacc taattgcctt tcttgtggat atctatgaag acatgctaga 780
aaatcagtgt gacaataagg aagacattct taataaagca ttagagttat gtgaaatcct 840
agctaaagaa aaggacacta taagaaagga atattggaga tacattggaa gatcccttca 900
aagcaaacac agcacagaaa atgactcacc aacaaatgta cagcaataac accatccaga 960
agaacttgat ggaatgcttt tattttttat taagggaccc tgcaggagtt tcacacgaga 1020
gtggtccttc cctttgcctg tggtgtaaaa gtgcatcaca caggtattgc tttttaacaa 1080
gaactgatgc tccttgggtg ctgctgctac tcagactagc tctaagtaat gtgattcttc 1140
taaagcaaag tcattggatg ggaggaggaa gaaaaagtcc cataaaggaa cttttgtagt 1200
cttatcaaca tataatctaa tcccttagca tcagctcctc cctcagtggt acatgcgtca 1260
agatttgtag cagtaataac tgcaggtcac ttgtatgtaa tggatgtgag gtagccgaag 1320
tttggttcag taagcaggga atacagtcgt tccatcagag ctggtctgca cactcacatt 1380
atcttgctat cactgtaacc aactaatgcc aaaagaacgg ttttgtaata aaattatagc 1440
tgtatctaaa aa 1452 

<210> 56 

<211> 1802 

<212> DNA 

<213> Homo sapiens

<220> 

<221> misc_feature 

<223> Incyte ID No: 7503141CB1

<400> 56 

cgccggtgcc gggcgaacat ggcggcggcc accggaccct cgttttggct ggggaatgaa 60 
accctgaagg tgccgctggc gctctttgcc ttgaaccggc agcgcctgtg tgagcggctg 120 
cggaagaacc ctgctgtgca ggccggctcc atcgtggtcc tgcagggcgg ggaggagact 180 
cagcgctact gcaccgacac cggggtcctc ttccgccagg agtccttctt tcactgggcg 240 
ttcggtgtca ctgagccagg ctgctatggt gtcatcgatg ttgacactgg gaagtcgacc 300 
ctgtttgtgc ccaggcttcc tgccagccat gccacctgga tgggaaagat ccattccaag 360 
gagcacttca aggagaagta tgccgtggac gacgtccagt acgtagatga gattgccagc 420 
gtcctgacgt cacagaagcc ctctgtcctc ctcactttgc gtggcgtcaa cacggacagc 480 
ggcagtgtct gcagggaggc ctcctttgac ggcatcagca agttcgaagt caacaatacc 540 
attcttcacc cagagatcgt tgagtgcctc ttcgagcact actgctactc ccggggcggc 600 
atgcgccaca gctcctacac ctgcatctgc ggcagtggtg agaactcagc cgtgctacac 660 
tacggacacg ccggagctcc caacgaccga acgatccaga atggggatat gtgcctgttc 720 
gacatgggcg gtgagtatta ctgcttcgct tccgacatca cctgctcctt tcccgccaac 780 
ggcaagttca ctgcagacca gaaggccgtc tatgaggcag tgctgcggag ctcccgtgcc 840 
gtcatgggtg ccatgaagcc aggtgtctgg tggcctgaca tgcaccgcct ggctgaccgc 900 
atccacctgg aggagctggc ccacatgggc atcctgagcg gcagcgtgga cgccatggtc 960 
caggctcacc tgggggccgt gtttatgcct cacgggcttg gccacttcct gggcattgac 1020 
gtgcacgacg tgggaggcta cccagagggc gtggagcgca tcgacgagcc cggcctgcgg 1080 
agcctgcgca ctgcacggca cctgcagcca ggcatggtgc tcaccgtgga gccgggcatc 1140 
tacttcatcg accacctcct ggatgaggcc ctggcggacc cggcccgcgc ctccttcctt 1200 
aaccgcgagg tcctgcagcg ctttcgcggt tttggcgggg tccgcatcga ggaggacgtc 1260 
gtggtgactg acagcggcat agagctgctg acctgcgtgc cccgcactgt ggaagagatt 1320 
gaagcatgca tggcaggctg tgacaaggcc tttaccccct tctctggccc caagtagagc 1380 
cagccagaaa tcccagcgca cctgggggcc tggccttgca acctcttttc gtgatgggca 1440 
gcctgctggt cagcactcca gtagcgagag acggcaccca gaatcagatc ccagcttcgg 1500 
catttgatca gaccaaacag tgctgtttcc cggggaggaa acactttttt aattaattac 1560 
ccttttgcag gctcccacct ttaatctgtt ttataccttg cttattaaat gagcgactta 1620 
aaatgattga aaataatgct gttctttagt agcaactaaa atgtgtcttg ctgtcattta 1680 
tattcctttt cccaggaaag aagcatttct gatactttct gtcaaaaatc aatatgcaga 1740 
atggcatttg caataaaagg tttcctaaaa tgaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800 

1802 

<210> 57 
<211> 1833 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7500362CB1 

<400> 57 

cgggcgaaca tggcggcggc caccggaccc tcgttttggc tggggaatga aaccctgaag 60
gtgccgctgg cgctctttgc cttgaaccgg cagcgcctgt gtgagcggct gcggaagaac 120
cctgctgtgc aggccggctc catcgtgtcc ttctttcact gggcgttcgg tgtcactgag 180 
ccaggctgct atggtgtcat cgatgttgac actgggaagt cgaccctgtt tgtgcccagg 240 
cttcctgcca gccatgccac ctggatggga aagatccatt ccaaggagca cttcaaggag 300 
aagtatgccg tggacgacgt ccagtacgta gatgagattg ccagcgtcct gacgtcacag 360 
aagccctctg tcctcctcac tttgcgtggc gtcaacacgg acagcggcag tgtctgcagg 420 
gaggcctcct ttgacggcat cagcaagttc gaagtcaaca ataccattct tcacccagag 480 
atcgttgagt gccgagtgtt taagacggat atggagctgg aggttctgcg ctataccaat 540 
aaaatctcca gcgaggccca ccgtgaggta atgaaggctg taaaagtggg aatgaaagaa 600 
tatgagttgg aaagcctctt cgagcactac tgctactccc ggggcggcat gcgccacagc 660 
tcctacacct gcatctgcgg cagtggtgag aactcagccg tgctacacta cggacacgcc 720 
ggagctccca acgaccgaac gatccagaat ggggatatgt gcctgttcga catgggcggt 780 
gagtattact gcttcgcttc cgacatcacc tgctcctttc ccgccaacgg caagttcact 840 
gcagaccaga aggccgtcta tgaggcagtg ctgcggagct cccgtgccgt catgggtgcc 900 
atgaagccag gtgtctggtg gcctgacatg caccgcctgg ctgaccgcat ccacctggag 960 
gagctggccc acatgggcat cctgagcggc agcgtggacg ccatggtcca ggctcacctg 1020 
ggggccgtgt ttatgcctca cgggcttggc cacttcctgg gcattgacgt gcacgacgtg 1080 
ggaggctacc cagagggcgt ggagcgcatc gacgagcccg gcctgcggag cctgcgcact 1140 
gcacggcacc tgcagccagg catggtgctc accgtggagc cgggcatcta cttcatcgac 1200 
cacctcctgg atgaggccct ggcggacccg gcccgcgcct ccttccttaa ccgcgaggtc 1260 
ctgcagcgct ttcgcggttt tggcggggtc cgcatcgagg aggacgtcgt ggtgactgac 1320 
agcggcatag agctgctgac ctgcgtgccc cgcactgtgg aagagattga agcatgcatg 1380
gcaggctgtg acaaggcctt tacccccttc tctggcccca agtagagcca gccagaaatc 1440 
ccagcgcacc tgggggcctg gccttgcaac ctcttttcgt gatgggcagc ctgctggtca 1500 
gcactccagt agcgagagac ggcacccaga atcagatccc agcttcggca tttgatcaga 1560 
ccaaacagtg ctgtttcccg gggaggaaac acttttttaa ttaccctttt gcaggctccc 1620 
acctttaatc tgttttatac cttgcttatt aaatgagcga cttaaaatga ttgaaaataa 1680 
tgctgttctt tagtagcaac taaaatgtgt cttgctgtca tttatattcc ttttcccagg 1740 
aaagaagcat ttctgatact ttctgtcaaa aatcaatatg cagaatggca tttgcaataa 1800 
aaggtttcct aaaatggtca aaaaaaaaaa aaa 1833 

<210> 58 
<211> 2465 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7503328CB1 



<400> 58 

ggcccttccg cgggtgatca gctggtctgc gctcccctga cgtgggctgg ggcacgtcac 60
cgccgaatgg cagcctccag aaagccaccg cgagtaaggg tgaatcacca ggattttcaa 120
ctgagaaatt taagaataat tgaacctaac gaggtgacac actcaggaga cacaggtgtg 180
gaaacagacg gcagaatgcc tccaaaggtg acttcagagc tgcttcggca gctgagacaa 240
gccatgagga actctgagta tgtgaccgaa ccgatccagg cctacatcat cccatcggga 300
gatgctcatc agagtgagta tattgctcca tgtgactgtc ggcgggcttt tgtctctgga 360
ttcgatggct ctgcgggcac agccatcatc acagaagagc atgcagccat gtggactgac 420
gggcgctact ttctccaggc tgccaagcaa atggacagca actggacact tatgaagatg 480
ggtctgaagg acacaccaac tcaggaagac tggctggtga gtgtgcttcc tgaaggatcc 540
agggttggtg tggacccctt gatcattcct acagattatt ggaagaaaat ggccaaagtt 600
ctgagaagtg ccggccatca cctcattcct gtcaaggaga acctcgttga caaaatctgg 660
acagaccgtc ctgagcgccc ttgcaagcct ctcctcacac tgggcctgga ttacacaggc 720
atctcctgga aggacaaggt tgcagacctt cggttgaaaa tggctgagag gaacgtcatg 780
tggtttgtgg tcactgcctt ggatgagatt gcgtggctat ttaatctccg aggatcggat 840
gtggagcaca atccagtatt tttctcctac gcaatcatag gactagagac gatcatgctc 900
ttcattgatg gtgaccgcat agacgccccc agtgtgaagg agcacctgct tcttgacttg 960
ggtctggaag ccgagtacag gatccaggtg catccctaca agtccatcct gagcgagctc 1020 
aaggccctgt gtgctgacct ctccccaagg gagaaggtgt gggtcagtga caaggccagc 1080 
tatgctgtga gcgagaccat ccccaaggac caccgctgct gtatgcctta cacccccatc 1140 
tgcatcgcca aagctgtgaa gaattcagct gagtcagaag gcatgaggcg ggctcacatt 1200 
aaagatgctg ttgctctctg tgaactcttt aactggctgg agaaagaggt tcccaaaggt 1260 
ggtgtgacag agatctcagc tgctgacaaa gctgaggagt ttcgcaggca acaggcagac 1320 
tttgtggacc tgagcttccc aacaatttcc agccagtccc tgagacgaat aggaccttgt 1380 
ccctggatga ggtgtacctt attgactcgg gtgctcaata caaggatggc accacagatg 1440 
tgacgcggac aatgcatttt gggaccccta cagcctacga gaaggaatgc ttcacatatg 1500 
tcctcaaggg ccacatagct gtgagtgcag ccgttttccc gactggaacc aaaggtcacc 1560 
ttcttgactc ctttgcccgt tcagctttat gggattcagg cctagattac ttgcacggga 1620 
ctggacatgg tgttgggtct tttttgaatg tccatgaggg tccttgcggc atcagttaca 1680 
aaacattctc tgatgagccc ttggaggcag gcatgattgt cactgatgag cccgggtact 1740 
atgaagatgg ggcttttgga attcgcattg agaatgttgt ccttgtggtt cctgtgaaga 1800 
ccaagtataa ttttaataac cggggaagcc tgacctttga acctctaaca ttggttccaa 1860 
ttcagaccaa aatgatagat gtggattctc ttacagacaa agagtgcgac tggctcaaca 1920 
attaccacct gacctgcagg gatgtgattg ggaaggaatt gcagaaacag ggccgccagg 1980 
aagctctcga gtggctcatc agagagacgc aactcatctc caaacagcat taataaatac 2040 
ctccccggtt ttgtttttgt aaaatgctct ggaggaagga agaaacgtgg cagatccctg 2100 
acatctttcc cctttccttt ccttcttccc cacctcccct ttttacttta gactttaaga 2160 
agaacagaaa atcttcttat cctctttgat attttattgc aaacactcag tcttttatga 2220 
ttttttaatt gttgagaaca agccaagaat aaaattgctg caccagaagg agggtccctc 2280 
caaagttgaa cacttggtga aaggaagatg ccccgacttc tttggccagt gatggggaat 2340 
cagtgagtgc tccatgatgg tcatgttcca ggtgctagta catcattcat gatcacctta 2400 
atgctcatga gactatattt atgatcagtg aataaaaatg tcagaactgt gaaaaaaaaa 2460 

aaaaa 2465



<210> 59 
<211> 2560 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7510464CB1 



<400> 59 

ggcccttccg cgggtgatca gctggtctgc gctcccctga cgtgggctgg ggcacgtcac 60
cgccgaatgg cagcctccag aaagccaccg cgagtaaggg tgaatcacca ggattttcaa 120
ctgagaaatt taagaataat tgaacctaac gaggtgacac actcaggaga cacaggtgtg 180
gaaacagacg gcagaatgcc tccaaaggtg acttcagagc tgcttcggca gctgagacaa 240
gccatgagga actctgagta tgtgaccgaa ccgatccagg cctacatcat cccatcggga 300
gatgctcatc agagtgagta tattgctcca tgtgactgtc ggcgggcttt tgtctctgga 360
ttcgatggct ctgcgggcac agccatcatc acagaagagc atgcagccat gtggactgac 420
gggcgctact ttctccaggc tgccaagcaa atggacagca actggacact tatgaagatg 480
ggtctgaagg acacaccaac tcaggaagac tggctggtga gtgtgcttcc tgaaggatcc 540
agggttggtg tggacccctt gatcattcct acagattatt ggaagaaaat ggccaaagtt 600
ctgagaagtg ccggccatca cctcattcct gtcaaggaga acctcgttga caaaatctgg 660
acagaccgtc ctgagcgccc ttgcaagcct ctcctcacac tgggcctgga ttacacaggc 720
atctcctgga aggacaaggt tgcagacctt cggttgaaaa tggctgagag gaacgtcatg 780
tggtttgtgg tcactgcctt ggatgagatt gcgtggctat ttaatctccg aggatcagat 840
gtggagcaca atccagtatt tttctcctac gcaatcatag gactagagac gatcatgctc 900
ttcattgatg gtgaccgcat agacgccccc agtgtgaagg agcacctgct tcttgacttg 960
ggtctggaag ccgaatacag gatccaggtg catccctaca agtccatcct gagcgagctc 1020
aaggccctgt gtgctgacct ctccccaagg gagaaggtgt gggtcagtga caaggccagc 1080
tatgctgtga gcgagaccat ccccaaggac caccgctgct gtatgcctta cacccccatc 1140
tgcatcgcca aagctgtgaa gaattcagct gagtcagaag gcatgaggcg ggctcacatt 1200 
aaagatgctg ttgctctctg tgaactcttt aactggctgg agaaagaggt tcccaaaggt 1260 
ggtgtgacag agatctcagc tgctgacaaa gctgaggagt ttcgcaggca acaggcagac 1320 
tttgtggacc tgagcttccc aacaatttcc agtacgggac ccaacggcgc catcattcac 1380 
tacgcgccag tccctgagac gaataggacc ttgtccctgg atgaggtgta ccttattgac 1440 
tcgggtgctc aatacaagga tggcaccaca gatgtgacgc ggacaatgca ttttgggacc 1500 
cctacagcct acgagaagga atgcttcaca tatgtcctca agggccacat agctgtgagt 1560 
gcagccgttt tcccgactgg aaccaaaggt caccttcttg actcctttgc ccgttcagct 1620 
ttatgggatt caggcctaga ttacttgcac gggactggac atggtgttgg gtcttttttg 1680 
aatgtccatg agggtccttg cggcatcagt tacaaaacat tctctgatga gcccttggag 1740
gcaggcatga ttgtcactga tgagcccggg tactatgaag atggggcttt tggaattcgc 1800 
attgagaatg ttgtccttgt ggttcctgtg aagaccaagt ataattttaa taaccgggga 1860 
agcctgacct ttgaacctct aacattggtt ccaattcaga ccaaaatgat agatgtggat 1920 
tctcttacag acaaagagga gctgtggaat gggattctcc cagctagaag cctcttctgc 1980
ctgttccagt tcacagtgcg actggctcaa caattaccac ctgacctgca gggatgtgat 2040 
tgggaaggaa ttgcagaaac agggccgcca ggaagctctc gagtggctca tcagagagac 2100 
gcaacccatc tccaaacagc attaataaat acctccccgg ttttgttttt gtaaaatgct 2160 
ctggaggaag gaagaaacgt ggcagatccc tgacatcttt cccctttcct ttccttcttc 2220 
cctacctccc ctttttactt tagactttaa gaagaacaga aaatcttctt atcctctttg 2280 
atattttatt gcaaacactc agtcttttat gattttttaa ttgttgagaa caagccaaga 2340 
ataaaattgc tgcaccagaa ggagggtccc tccaaagttg aacacttggt gaaaggaaga 2400 
tgccccgact tctttggcca gtgatgggga atcagtgagt gctccatgat ggtcatgttc 2460 
caggtgctag tacatcattc atgatcacct taatgctcat gagactatat ttatgatcag 2520 
tgaataaaaa tgtcagaact gtgaaaaaaa aaaaaaaaaa 2560 

<210> 60 

<211> 2254 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7510394CB1 

<400> 60 

gcaggcatgg cggcggctat gccgcttgct ctgctcgtcc tgttgctcct ggggcccggc 60 
ggctggtgcc ttgcagaacc cccacgcgac agcctgcggg aggaacttgt catcaccccg 120 
ctgccttccg gggacgtagc cgccacattc cagttccgca cgcgctggga ttcggagctt 180 
cagcgggaag gagtgtccca ttacaggctc tttcccaaag ccctggggca gctgatctcc 240 
aagtattctc tacgggagct gcacctgtca ttcacacaag gcttttggag gacccgatac 300 
tgggggccac ccttcctgca ggccccatca ggtgcagagc tgtgggtctg gttccaagac 360 
actgtcactg agtttagcag ccagctgtgg actttgaaag agggagcaga ggtagcccca 420 
ggacagtgag tggatttgtg tctctatcca gtgtggataa atcttggaag gagctcagta 480 
atgtcctctc agggatcttc tgcgcctctc tcaacttcat cgactccacc aacacagtca 540 
ctcccactgc ctccttcaaa cccctgggtc tggccaatga cactgaccac tactttctgc 600 
gctatgctgt gctgccgcgg gaggtggtct gcaccgaaaa cctcaccccc tggaagaagc 660 
tcttgccctg tagttccaag gcaggcctct ctgtgctgct gaaggcagat cgcttgttcc 720 
acaccagcta ccactcccag gcagtgcata tccgccctgt ttgcagaaat gcacgctgta 780 
ctagcatctc ctgggagctg aggcagaccc tgtcagttgt atttgatgcc ttcatcacgg 840 
ggcagggaaa gaaagactgg tccctcttcc ggatgttctc ccgaaccctc acggagccct 900 
gccccctggc ttcagagagc cgagtctatg tggacatcac cacctacaac caggacaacg 960 
agacattaga ggtgcaccca cccccgacca ctacatatca ggacgtcatc ctaggcactc 1020 
ggaagaccta tgccatctat gacttgcttg acaccgccat gatcaacaac tctcgaaacc 1080 
tcaacatcca gctcaagtgg aagagacccc cagagaatga ggccccccca gtgcccttcc 1140 
tgcatgccca gcggtacgtg agtggctatg ggctgcagaa gggggagctg agcacactgc 1200
tgtacaacac ccacccatac cgggccttcc cggtgctgct gctggacacc gtaccctggt 1260
atctgcggct gtatgtgcac accctcacca tcacctccaa gggcaaggag aacaaaccaa 1320 
gttacatcca ctaccagcct gcccaggacc ggctgcaacc ccacctcctg gagatgctga 1380 
ttcagctgcc ggccaactca gtcaccaagg tttccatcca gtttgagcgg gcgctgctga 1440 
agtggaccga gtacacgcca gatcctaacc atggcttcta tgtcagccca tctgtcctca 1500 
gcgcccttgt gcccagcatg gtagcagcca agccagtgga ctgggaagag agtcccctct 1560 
tcaacagcct gttcccagtc tctgatggct ctaactactt tgtgcggctc tacacggagc 1620 
cgctgctggt gaacctgccg acaccggact tcagcatgcc ctacaacgtg atctgcctca 1680 
cgtgcactgt ggtggccgtg tgctacggct ccttctacaa tctcctcacc cgaaccttcc 1740 
acatcgagga gccccgcaca ggtggcctgg ccaagcggct ggccaacctt atccggcgcg 1800 
cccgaggtgt ccccccactc tgattcttgc cctttccagc agctgcagct gccgtttctc 1860 
tctggggagg ggagcccaag ggctgtttct gccacttgct ctcctcagag ttggcttttg 1920 
aaccaaagtg ccctggacca ggtcagggcc tacagctgtg ttgtccagta caggagccac 1980 
gagccaaatg tggcatttga atttgaatta acttagaaat tcatttcctc acctgtagtg 2040 
gccacctcta tattgaggtg ctcaataagc aaaagtggtc ggtggctgct gtattggaca 2100 
gcacagaaaa agatttccat caccacagaa aggtcggctg gcagcactgg ccaaggtgat 2160 
ggggtgtgct acacagtgta tgtcactgtg tagtggatgg agtttactgt ttgtggaata 2220 
aaaacggctg tttccgtgga aaaaaaaaaa aaaa 2254 

<210> 61 
<211> 2139 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7500745CB1 

<220> 

<221> unsure 
<222> 2126 

<223> a, t, c, g, or other 
<400> 61 

gccgcaggca tggcggcggc tatgccgctt gctctgctcg tcctgttgct cctggggccc 60 
ggcggctggt gccttgcaga acccccacgc gacagcctgc gggaggaact tgtcatcacc 120 
ccgctgcctt ccggggacgt agccgccaca ttccagttcc gcacgcgctg ggattcggag 180 
cttcagcggg aaggagtgtc ccattacagg ctctttccca aagccctggg gcagctgatc 240 
tccaagtatt ctctacggga gctgcacctg tcattcacac aaggcttttg gaggacccga 300 
tactgggggc cacccttcct gcaggcccca tcagtgtgga taaatcttgg aaggagctca 360 
gtaatgtcct ctcagggatc ttctgcgcct ctctcaactt catcgactcc accaacacag 420 
tcactcccac tgcctccttc aaacccctgg gtctggccaa tgacactgac cactactttc 480 
tgcgctatgc tgtgctgccg cgggaggtgg tctgcaccga aaacctcacc ccctggaaga 540 
agctcttgcc ctgtagttcc aaggcaggcc tctctgtgct gctgaaggca gatcgcttgt 600 
tccacaccag ctaccactcc caggcagtgc atatccgccc tgtttgcaga aatgcacgct 660 
gtactagcat ctcctgggag ctgaggcaga ccctgtcagt tgtatttgat gccttcatca 720 
cggggcaggg aaagaaagac tggtccctct tccggatgtt ctcccgaacc ctcacggagc 780 
cctgccccct ggcttcagag agccgagtct atgtggacat caccacctac aaccaggaca 840 
acgagacatt agaggtgcac ccacccctga ccactacata tcaggacgtc atcctaggca 900 
ctcggaagac ctatgccatc tatgacttgc ttgacaccgc catgatcaac aactctcgaa 960 
acctcaacat ccagctcaag tggaagagac ccccagagaa tgaggccccc ccagtgccct 1020 
tcctgcatgc ccagcggtac gtgagtggct atgggctgca gaagggggag ctgagcacac 1080 
tgctgtacaa cacccaccca taccgggcct tcccggtgct gctgctggac accgtaccct 1140 
ggtatctgcg gctgtatgtg cacaccctca ccatcacctc caagggcaag gagaacaaac 1200 
caagttacat ccactaccag cctgcccagg accggctgca accccacctc ctggagatgc 1260 
tgattcagct gccggccaac tcagtcacca aggtttccat ccagtttgag cgggcgctgc 1320
tgaagtggac cgagtacacg ccagatccta accatggctt ctatgtcagc ccatctgtcc 1380

tcagcgccct tgtgcccagc atggtagcag ccaagccagt ggactgggaa gagagtcccc 1440 

tcttcaacag cctgttccca gtctctgatg gctctaacta ctttgtgcgg ctctacacgg 1500 

agccgctgct ggtgaacctg ccgacaccgg acttcagcat gccctacaac gtgatctgcc 1560 

tcacgtgcac tgtggtggcc gtgtgctacg gctccttcta caatctcctc acccgaacct 1620 

tccacatcga ggagccccgc acaggtggcc tggccaagcg gctggccaac cttatccggc 1680 

gcgcccgagg tgtcccccca ctctgattct tgccctttcc agcagctgca gctgccgttt 1740 

ctctctgggg aggggagccc aagggctgtt tctgccactt gctctcctca gagttggctt 1800 

ttgaaccaaa gtgccctgga ccaggtcagg gcctacagct gtgttgtcca gtacaggagc 1860 

cacgagccaa atgtggcatt tgaatttgaa ttaacttaga aattcatttc ctcacctgta 1920 

gtggccacct ctatattgag gtgctcaata agcaaaagtg gtcggtggct gctgtattgg 1980 

acagcacaga aaaagatttc catcaccaca gaaaggtcgg ctggcagcac tggccaaggt 2040 

gatggggtgt gctacacagt gtatgtcact gtgtagtgga tggagtttac tgtttgtgga 2100 

ataaaaacgg ctgtttccgt gaaaanaaaa aaaaaaagg 2139 

<210> 62 

<211> 648 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7500929CB1 

<400> 62 

gtggaagaga cgcgccggcg ttggccgctg ctgctagcag cttgaacccc agggtcggga 60 
ccgatgtcgg cttgggctgc tgccagccta agcagggccg ctgcccgatg cttgctggca 120 
cgaggccccg gggtcagggc ggctcctccg cgcgaccccc ggccctccca ccccgagccc 180 
cggggctgcg gtgccgctcc gggcaggacg ctgcacttta ccgcggctgt ccccgccggg 240 
cacaacaagt ggtccaaagt caggcacatc aagggtccga aggacgtcga aaggagtcgc 300 
atcttctcca aactctgttt gaacatccgc ctggcagtga aagccaggag gcccaaggac 360 
aggacttgcg accttgaagc caaaggaatc tcacttgtgg ggcctccttg tcagctctgc 420 
tgctgtctca gagccatctg gatgagtgtc ccgacaccct ctcggatgca gggcaggacc 480 
acccagctgg tcagactctg atgttgggta gctggcctct gtggggattg taagtgccct 540 
gaggcgctct gtactagaaa ctgctcttaa taataacggt gattattggt ttgctgcaaa 600
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 648